LinkedInLearning/apache-pyspark-by-example-802868

Apache PySpark by Example

This is the repository for the LinkedIn Learning course Apache PySpark by Example. The full course is available from LinkedIn Learning.

Want to get up and running with Apache Spark as soon as possible? If you're well versed in Python, the Spark Python API (PySpark) is your ticket to accessing the power of this hugely popular big data platform. This practical, hands-on course helps you get comfortable with PySpark, explaining what it has to offer and how it can enhance your data science work. To begin, instructor Jonathan Fernandes digs into the Spark ecosystem, detailing its advantages over other data science platforms, APIs, and tool sets. Next, he looks at the DataFrame API and how it's the platform's answer to many big data challenges. Finally, he goes over Resilient Distributed Datasets (RDDs), the building blocks of Spark.

Instructor

Jonathan Fernandes

Data Scientist

Check out my other courses on LinkedIn Learning.
