LinkedInLearning/using-large-datasets-with-pandas-4467955

Using Large Datasets with pandas

This is the repository for the LinkedIn Learning course Using Large Datasets with pandas. The full course is available from LinkedIn Learning.

As data grows in size and complexity, most enterprises start to think about migrating to a larger-format data system such as Spark. However, this move can be quite painful, and you’ll most likely need to learn an entirely new set of tools. In this course, join instructor Miki Tebeka to learn how to get started working with large datasets using pandas, the fast, powerful, flexible, and easy-to-use data analysis tool built on top of the Python programming language. Find out how to navigate storage formats, tips for saving memory, memory-efficient computation strategies, and more. Along the way, Miki also demonstrates how to leverage a handful of alternatives to plain pandas, such as Dask (which mirrors much of the pandas API), Polars, and simply running on a beefy VM.
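As a taste of the "saving memory" theme mentioned above, here is a minimal sketch. It is not taken from the course materials; the column names (`city`, `passengers`), the synthetic data, and the helper `shrink` are hypothetical and used only for illustration. It shows two common pandas techniques: storing a low-cardinality string column as `category`, and downcasting integers to the smallest dtype that fits.

```python
import numpy as np
import pandas as pd

# Hypothetical data: many rows, a low-cardinality string column,
# and a small-valued integer column (int64 by default).
rng = np.random.default_rng(0)
n = 100_000
df = pd.DataFrame({
    "city": rng.choice(["NYC", "London", "Tokyo"], size=n),
    "passengers": rng.integers(1, 7, size=n),
})


def shrink(frame):
    """Return a copy of frame with more compact dtypes (illustrative helper)."""
    out = frame.copy()
    # Repeated strings are stored once; rows hold small integer codes.
    out["city"] = out["city"].astype("category")
    # Pick the smallest unsigned integer dtype that can hold the values.
    out["passengers"] = pd.to_numeric(out["passengers"], downcast="unsigned")
    return out


small = shrink(df)
before = df.memory_usage(deep=True).sum()
after = small.memory_usage(deep=True).sum()
print(f"before: {before / 1e6:.1f} MB, after: {after / 1e6:.1f} MB")
```

The exact savings depend on the data and the pandas version, so the sketch prints the measured sizes rather than assuming them. `memory_usage(deep=True)` is used so that the size of the Python string objects is included in the "before" figure.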

This course is integrated with GitHub Codespaces, an instant cloud developer environment that offers all the functionality of your favorite IDE without the need for any local machine setup. With Codespaces, you can get hands-on practice from any machine, at any time—all while using a tool that you’ll likely encounter in the workplace. Check out the “Using GitHub Codespaces with this course” video to learn how to get started.

Instructor

Miki Tebeka

CEO at 353Solutions

Check out my other courses on LinkedIn Learning.
