Dec 31 2020 - Azure Databricks documentation, learning materials and additional resources

This Azure Databricks repository is a set of blog posts, offered as an Advent of 2020 present to readers for easier onboarding to Azure Databricks!

Series of Azure Databricks posts:

In the last two days we have focused on understanding Apache Spark through performance tuning and troubleshooting. Both require a deeper understanding of Spark and Azure Databricks, but they also give great insight to anyone who needs to improve performance and work with Spark.

Today, I would like to list a couple of additional learning materials, documentation, and other resources for further exploration of Azure Databricks.

Databricks / Azure Databricks

A good way to start your learning path is the vendor documentation: https://docs.databricks.com/.

Microsoft has also created great documentation for Azure Databricks: https://docs.microsoft.com/en-gb/azure/databricks/

Databricks is vendor agnostic, so one should also look at the AWS offering and documentation: https://databricks.com/aws

Check GitHub for great examples and documentation on Databricks and all related content:
- https://github.com/databricks
- https://github.com/MicrosoftLearning/databricks-intro

Spark

Apache Spark offers extensive documentation on the Apache Spark website:
- https://spark.apache.org/docs/latest/index.html

Books worth reading:
- Spark: The Definitive Guide: Big Data Processing Made Simple
- Learning Spark: Lightning-Fast Big Data Analysis
- High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark

GitHub:
- https://github.com/MicrosoftLearning/databricks-intro
- https://microsoftlearning.github.io/databricks-ml/
- https://github.com/tomaztk/Azure-Databricks

Machine Learning (MLlib)

Great documentation can be found at: https://spark.apache.org/mllib/

Some GitHub examples of using Machine Learning and MLlib:
- https://github.com/databricks/tech-talks
- https://github.com/dennyglee/databricks

Certifications and training:
- Microsoft - https://docs.microsoft.com/en-us/azure/databricks/getting-started/training-faq
- Databricks - a great way to get yourself certified: https://academy.databricks.com/category/certifications
- Amazon - https://databricks.com/p/webinar/aws-databricks-training-series

Certification is also a good way to get to know the product and its features. Databricks certifications are fun!

There are also many online courses worth checking, as well as great courses from many training companies.

As always, the complete set of code and the notebook is available at the GitHub repository.

Happy Coding and Stay Healthy! And Happy New Year 2021! Wish you all the best!