docs: Adding a multivariate anomaly detection sample notebook #1365

Merged
merged 21 commits into from Feb 2, 2022

yalaudah
Contributor

@yalaudah yalaudah commented Jan 26, 2022

SynapseML v0.9.5 added support for multivariate anomaly detection. Here's a sample notebook that hopefully helps demonstrate this capability.

You can view the rendered version of the notebook here.

Note: Please let me know where I can upload the sample.csv file I'm using in the notebook. I've seen other notebooks in the repo use wasbs://publicwasb@mmlspark.blob.core.windows.net/iot but I don't have access to upload any file there.

Yazeed Alaudah added 4 commits January 25, 2022 18:20
@yalaudah yalaudah changed the title Adding a multivariate anomaly detection sample notebook docs: Adding a multivariate anomaly detection sample notebook Jan 26, 2022
@yalaudah
Contributor Author

I just updated the URL of the sample CSV to point to wasbs://publicwasb@mmlspark.blob.core.windows.net/MVAD/sample.csv.

Thanks @serena-ruan!
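
For readers unfamiliar with the wasbs:// form used above, here is a minimal sketch of how such a URL breaks down into container, storage account, and blob path. The helper name split_wasbs_url is hypothetical and is not part of SynapseML or the notebook; it only uses Python's standard urllib.

```python
from urllib.parse import urlsplit


def split_wasbs_url(url):
    """Split a wasb(s):// URL into (container, storage account, blob path).

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("wasb", "wasbs"):
        raise ValueError(f"not a wasb(s) URL: {url}")
    container = parts.username              # text before the '@'
    account = parts.hostname.split(".")[0]  # first label of the host name
    blob_path = parts.path.lstrip("/")      # path inside the container
    return container, account, blob_path


container, account, blob_path = split_wasbs_url(
    "wasbs://publicwasb@mmlspark.blob.core.windows.net/MVAD/sample.csv"
)
print(container, account, blob_path)
```
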

@serena-ruan
Contributor

serena-ruan commented Jan 26, 2022

Hey @yalaudah, could you update the first two cells, which get anomaly_key and connectionString, to follow the pattern used in this notebook (https://github.com/microsoft/SynapseML/blob/master/notebooks/features/cognitive_services/CognitiveServices%20-%20Overview.ipynb)?
Only the anomaly_key and connectionString parts, though. Thanks!
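
As a rough illustration of that kind of setup cell, the sketch below reads both values from environment variables instead of hard-coding them. It does not reproduce the linked notebook. The variable name ANOMALY_API_KEY and the helper load_secrets are assumptions made for this example; BLOB_CONNECTION_STRING is the name mentioned later in this thread.

```python
import os


def load_secrets(env=os.environ):
    """Return (anomaly_key, connection_string) from environment variables.

    ANOMALY_API_KEY is an assumed name; BLOB_CONNECTION_STRING is the name
    mentioned in this PR thread.
    """
    names = ("ANOMALY_API_KEY", "BLOB_CONNECTION_STRING")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("missing environment variable(s): " + ", ".join(missing))
    return env["ANOMALY_API_KEY"], env["BLOB_CONNECTION_STRING"]


# Stand-in values so the sketch runs without real credentials.
fake_env = {
    "ANOMALY_API_KEY": "placeholder-key",
    "BLOB_CONNECTION_STRING": "AccountName=exampleaccount;AccountKey=ZmFrZS1rZXk=",
}
anomaly_key, connectionString = load_secrets(fake_env)
```

Failing fast on a missing variable keeps a misconfigured test environment from surfacing later as an opaque service error.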

@serena-ruan
Contributor

Also, after you address the above comment, please delete this Spark session creation part:
spark = pyspark.sql.SparkSession.builder.appName("MyApp") \
    .config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:0.9.5") \
    .config("spark.jars.repositories", "https://mmlspark.azureedge.net/maven") \
    .getOrCreate()
We don't need this anymore: the tests install the corresponding version automatically, and customers should follow the installation section of the website to install the package.
(Sorry for leaving comments like this; the notebook doesn't render correctly on my side.)

@yalaudah
Contributor Author

Thanks @serena-ruan! I've updated the notebook to address the two comments above.

@serena-ruan
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@codecov-commenter

codecov-commenter commented Jan 27, 2022

Codecov Report

Merging #1365 (fa9ca72) into master (0840e31) will decrease coverage by 18.02%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           master    #1365       +/-   ##
===========================================
- Coverage   83.71%   65.68%   -18.03%     
===========================================
  Files         287      268       -19     
  Lines       14239    13255      -984     
  Branches      728      728               
===========================================
- Hits        11920     8707     -3213     
- Misses       2319     4548     +2229     
Impacted Files Coverage Δ
...la/org/apache/spark/ml/param/TypedArrayParam.scala 0.00% <0.00%> (-100.00%) ⬇️
.../microsoft/azure/synapse/ml/io/binary/Binary.scala 0.00% <0.00%> (-100.00%) ⬇️
...m/microsoft/azure/synapse/ml/stages/Batchers.scala 3.70% <0.00%> (-92.60%) ⬇️
.../execution/streaming/continuous/HTTPSourceV2.scala 0.00% <0.00%> (-92.09%) ⬇️
...che/spark/sql/execution/streaming/HTTPSource.scala 0.00% <0.00%> (-90.00%) ⬇️
...ql/execution/streaming/continuous/HTTPSinkV2.scala 0.00% <0.00%> (-89.75%) ⬇️
...soft/azure/synapse/ml/cognitive/AudioStreams.scala 0.00% <0.00%> (-87.88%) ⬇️
...ft/azure/synapse/ml/cognitive/AzureSearchAPI.scala 2.59% <0.00%> (-87.02%) ⬇️
...synapse/ml/cognitive/TextAnalyticsSDKSchemas.scala 0.00% <0.00%> (-81.20%) ⬇️
...osoft/azure/synapse/ml/cognitive/AzureSearch.scala 7.19% <0.00%> (-80.58%) ⬇️
... and 92 more

Powered by Codecov. Last update 0840e31...fa9ca72. Read the comment docs.
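
The figures in the coverage diff above can be cross-checked with a few lines of arithmetic. The sketch below assumes coverage is hits divided by lines and that percentages are truncated, not rounded, to two decimals. The report does not state its rounding rule, so truncation is only an assumption that happens to reproduce both the 18.03 in the table and the 18.02 in the summary line.

```python
import math
from fractions import Fraction


def coverage_bp(hits, lines):
    """Coverage in basis points (hundredths of a percent), truncated."""
    return hits * 10000 // lines


def fmt(bp):
    """Format basis points as a percentage string."""
    return f"{bp / 100:.2f}%"


# Hits and Lines taken from the report above.
master_bp = coverage_bp(11920, 14239)
pr_bp = coverage_bp(8707, 13255)

# Difference of the two truncated percentages (the table's delta).
table_delta_bp = pr_bp - master_bp

# Exact difference, truncated once at the end (the summary line's figure).
exact = Fraction(11920, 14239) - Fraction(8707, 13255)
summary_drop_bp = math.floor(exact * 10000)

print(fmt(master_bp), fmt(pr_bp), fmt(table_delta_bp), fmt(summary_drop_bp))
```

The same report also satisfies Hits + Misses = Lines on both sides (11920 + 2319 = 14239 and 8707 + 4548 = 13255).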

@mhamilton723
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mhamilton723
Collaborator

Hey @yalaudah thank you so much for this PR. I gave you access to queue builds with /azp run.

@serena-ruan
Contributor

Hi @yalaudah, for the website to render the notebook successfully, I suggest you send the image to me so I can upload it to the storage account for you; then we can render it with a link. BTW, I should have fixed that Databricks key error, so this image step should be the last thing needed to get this PR in :)

@yalaudah
Contributor Author

Thanks @mhamilton723 and @serena-ruan!

@serena-ruan, I've sent you the image, please let me know if there is anything else I can help with.

@yalaudah
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mhamilton723
Collaborator

@yalaudah this looks like an error caused by the secrets not being in the Databricks environment. I added them and requeued.

@mhamilton723
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@yalaudah
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@yalaudah
Contributor Author

yalaudah commented Feb 1, 2022

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@yalaudah
Contributor Author

yalaudah commented Feb 1, 2022

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@yalaudah
Contributor Author

yalaudah commented Feb 1, 2022

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@yalaudah
Contributor Author

yalaudah commented Feb 2, 2022

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@yalaudah
Contributor Author

yalaudah commented Feb 2, 2022

Updated the initialization script to fix an issue where BLOB_CONNECTION_STRING was not being parsed correctly.
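
The thread does not say what the parsing issue was or how the script was changed. As a hedged illustration of one common pitfall, Azure Storage connection strings are semicolon-separated Key=Value pairs whose AccountKey value can itself end in "=" padding, so splitting each pair on every "=" loses part of the key. The sketch below splits on the first "=" only. The helper name parse_connection_string and the sample values are made up for this example.

```python
def parse_connection_string(conn_str):
    """Parse 'Key1=Value1;Key2=Value2' into a dict.

    Each segment is split on its FIRST '=' only, so values that contain
    '=' (such as base64 padding in AccountKey) are kept intact.
    """
    settings = {}
    for segment in conn_str.strip().split(";"):
        if not segment:
            continue  # tolerate a trailing ';'
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"malformed segment: {segment!r}")
        settings[key.strip()] = value
    return settings


# Placeholder values only, not real credentials.
example = (
    "DefaultEndpointsProtocol=https;AccountName=exampleaccount;"
    "AccountKey=ZmFrZS1rZXk=;EndpointSuffix=core.windows.net;"
)
parsed = parse_connection_string(example)
print(parsed["AccountName"], parsed["AccountKey"])
```
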

@mhamilton723 mhamilton723 merged commit d602c7b into microsoft:master Feb 2, 2022
4 participants