New Python clustering tutorial #271

NelGson · 2017-08-18T03:50:33Z

No description provided.

added ML Services Python

uc-msft · 2017-08-18T03:57:12Z

...machine-learning-services/python/getting-started/customer-clustering/customer_clustering.sql

+from sklearn.cluster import KMeans
+
+#get data from input query
+customer_data = my_input_data


You can remove this line once @input_data_1_name is changed to 'customer_data'.

uc-msft · 2017-08-18T03:57:21Z

...machine-learning-services/python/getting-started/customer-clustering/customer_clustering.sql

+OutputDataSet = customer_data
+'
+	, @input_data_1 = @input_query
+	, @input_data_1_name = N'my_input_data'


Change this to N'customer_data' so the input data arrives under the name the script already uses.

uc-msft · 2017-08-18T03:57:58Z

...machine-learning-services/python/getting-started/customer-clustering/customer_clustering.sql

+clusters = est.labels_
+customer_data["cluster"] = clusters
+
+OutputDataSet = customer_data


You can remove this line and pass @output_data_1_name = N'customer_data' instead. That returns the customer_data data frame to SQL Server directly.
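The three naming comments on customer_clustering.sql rely on the same mechanism: the script receives its input under the name given by @input_data_1_name and returns whatever variable is named by @output_data_1_name (the defaults are InputDataSet and OutputDataSet). The following is only a toy model of that binding in plain Python, not how SQL Server implements it. `run_external_script` is a hypothetical helper, the rows are plain dicts rather than a pandas data frame, and a simple threshold stands in for the KMeans step.

```python
# Toy model of the name binding done by sp_execute_external_script.
# run_external_script is a hypothetical helper written for this illustration;
# it is not a SQL Server or Python library API.

def run_external_script(script, input_rows, input_name="InputDataSet",
                        output_name="OutputDataSet"):
    """Run `script` with `input_rows` bound to `input_name`, then return
    whatever object the script left under `output_name`."""
    namespace = {input_name: input_rows}
    exec(script, namespace)
    return namespace[output_name]


rows = [{"customer": 1, "frequency": 3}, {"customer": 2, "frequency": 9}]

# Before the review: custom input name, default output name, two copy lines.
# The threshold rule is an assumed stand-in for the KMeans labelling.
before = """
customer_data = my_input_data
for row in customer_data:
    row["cluster"] = 0 if row["frequency"] < 5 else 1
OutputDataSet = customer_data
"""
result_before = run_external_script(before, [dict(r) for r in rows],
                                    input_name="my_input_data")

# After the review: both names are customer_data, so the copy lines go away.
after = """
for row in customer_data:
    row["cluster"] = 0 if row["frequency"] < 5 else 1
"""
result_after = run_external_script(after, [dict(r) for r in rows],
                                   input_name="customer_data",
                                   output_name="customer_data")

print(result_before == result_after)
```

Both variants hand back the same rows; the second simply avoids two assignments that exist only to rename variables.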

uc-msft · 2017-08-18T03:58:34Z

...machine-learning-services/python/getting-started/customer-clustering/customer_clustering.sql

+-- Stored procedure that performs customer clustering using Python and SQL Server ML Services
+DROP PROCEDURE IF EXISTS [dbo].[py_generate_customer_return_clusters]
+GO
+CREATE procedure [dbo].[py_generate_customer_return_clusters]


Change to CREATE OR ALTER and remove the DROP PROCEDURE IF EXISTS.

uc-msft · 2017-08-18T04:00:35Z

...machine-learning-services/python/getting-started/customer-clustering/customer_clustering.sql

+  JOIN
+  [dbo].[py_customer_clusters] as c
+  ON c.Customer = customer.c_customer_sk
+  WHERE c.cluster = 0;


It looks like there is no trailing newline here, or the file ends with a special character. Can you add a newline after the ; to make sure it runs?

uc-msft · 2017-08-18T04:02:18Z

...chine-learning-services/python/getting-started/customer-clustering/customer_clustering_ng.py

+        "frequency": {"type": "integer"}
+    }
+
+    data_source = RxSqlServerData(sql_query=input_query, column_Info=column_info, connection_string=conn_str)


I think this keyword argument should be column_info, all lowercase.
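Why the capital letter matters: Python keyword arguments are case-sensitive, so `column_Info` and `column_info` are different names. The sketch below uses two hypothetical stand-in functions, not the real RxSqlServerData, to show the two ways such a typo can surface. Whether the real constructor rejects or silently ignores an unknown keyword is not confirmed here; it depends on whether its signature accepts arbitrary `**kwargs`.

```python
# Hypothetical stand-ins for illustration; neither is the real RxSqlServerData.

def strict_source(sql_query=None, column_info=None, connection_string=None):
    """Accepts only the keywords listed in its signature."""
    return {"sql_query": sql_query, "column_info": column_info}


def lenient_source(sql_query=None, column_info=None, connection_string=None,
                   **kwargs):
    """Also accepts arbitrary extra keywords and records their names."""
    return {"sql_query": sql_query, "column_info": column_info,
            "ignored": sorted(kwargs)}


column_info = {"frequency": {"type": "integer"}}

# Case 1: a strict signature rejects the misspelled keyword with a TypeError.
try:
    strict_source(sql_query="SELECT 1", column_Info=column_info)
    strict_error = None
except TypeError as exc:
    strict_error = str(exc)

# Case 2: a **kwargs signature swallows it, so column_info stays None
# and the column metadata is never applied.
lenient = lenient_source(sql_query="SELECT 1", column_Info=column_info)

print(strict_error)
print(lenient["column_info"], lenient["ignored"])
```

The second case is the more dangerous one, because the call succeeds and the column types are quietly dropped.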

uc-msft · 2017-08-18T04:03:03Z

...chine-learning-services/python/getting-started/customer-clustering/customer_clustering_ng.py

+    }
+
+    data_source = RxSqlServerData(sql_query=input_query, column_Info=column_info, connection_string=conn_str)
+    RxInSqlServer(connection_string=conn_str, num_tasks=1, auto_cleanup=False)


You can remove this line. It is not needed because we are not using the SQL Server compute context, and the object it creates is never assigned or used.

uc-msft · 2017-08-18T04:04:05Z

...chine-learning-services/python/getting-started/customer-clustering/customer_clustering_ng.py

+    print(customer_data.groupby(['cluster']).mean())
+
+
+perform_clustering()


I see a special character at the end of the file. Can you add a newline at the end?
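For anyone who wants to check this locally before pushing, here is a minimal sketch that detects a missing final newline and appends one. `ensure_trailing_newline` is a hypothetical helper name, and the demo writes to a temporary file rather than to the files in this PR.

```python
import os
import tempfile


def ensure_trailing_newline(path):
    """Append a newline to `path` if its last byte is not one.

    Returns True if the file was modified, False otherwise.
    """
    with open(path, "rb+") as handle:
        handle.seek(0, os.SEEK_END)
        if handle.tell() == 0:
            return False  # empty file: nothing to terminate
        handle.seek(-1, os.SEEK_END)
        if handle.read(1) == b"\n":
            return False  # already ends with a newline
        handle.seek(0, os.SEEK_END)
        handle.write(b"\n")
        return True


# Demo on a throwaway file whose last line has no newline.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.py")
    with open(demo_path, "wb") as handle:
        handle.write(b"perform_clustering()")
    first_pass = ensure_trailing_newline(demo_path)   # file is fixed
    second_pass = ensure_trailing_newline(demo_path)  # nothing left to do
    with open(demo_path, "rb") as handle:
        final_bytes = handle.read()

print(first_pass, second_pass, final_bytes)
```

The helper is idempotent, so running it twice on the same file adds only one newline.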

NelGson · 2017-08-18T04:47:31Z

Updated according to the review comments.

negust-microsoft added 2 commits August 17, 2017 20:47

Added new Python tutorial

53d6201

Updated Feature readme

646e7c7

added ML Services Python

msftclas added the cla-required label Aug 18, 2017

negust-microsoft added 2 commits August 17, 2017 21:10

added empty line at end of file

dc7df49

added empty line at end of file

310fedb

uc-msft reviewed Aug 18, 2017

negust-microsoft added 2 commits August 17, 2017 21:44

Updated customer clustering .py and .sql files

104c56a

Updated clustering python .py and .sql files

68991ca

Removed .vs folder and updated python sql file

21a5601

uc-msft merged commit 7171589 into microsoft:master Aug 18, 2017
