
commit-live-students/pandas_in_class


Data Manipulation with pandas DataFrame

Many real-world datasets mix strings, integers, timestamps and unstructured data. How do you store data like this so that you can manipulate it and easily retrieve the information that matters? The answer is a pandas DataFrame!
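As a minimal sketch of that idea, the snippet below builds a small DataFrame that holds strings, integers, timestamps and free-text notes side by side, then filters and looks up rows. The table, its column names (`station`, `temperature_c`, `observed_at`, `notes`) and its values are made up for illustration; they are not taken from the class Weather Dataset.

```python
import pandas as pd

# Hypothetical toy records: column names and values are invented for illustration.
weather = pd.DataFrame(
    {
        "station": ["A", "B", "A"],                      # strings
        "temperature_c": [21, 18, 25],                   # integers
        "observed_at": pd.to_datetime(
            ["2021-01-01 09:00", "2021-01-01 09:00", "2021-01-02 09:00"]
        ),                                               # timestamps
        "notes": ["clear sky", "light rain", None],      # free text, may be missing
    }
)

# Each column keeps its own data type.
print(weather.dtypes)

# Retrieve rows with a boolean mask instead of a loop.
warm = weather[weather["temperature_c"] > 20]

# Look up the row with the most recent timestamp.
latest = weather.loc[weather["observed_at"].idxmax()]
print(warm)
print(latest)
```

Each column carries its own dtype, which is what lets pandas filter, sort and aggregate mixed data without hand-written loops.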

At a glance

  • In-Class Instruction: 4 Hours
    • In-Class Code-Along Dataset: Weather Dataset
  • Project Dataset: Indian Premier League
    • Estimated Time to Complete Project Tasks: 1 Hour
    • Total Sub-Tasks within the Project: 6
    • Complexity of Sub-Tasks: Mid to High
    • Points to Be Scored: 700
  • Why should you care about this project: It challenges you to extract business insights from large datasets using pandas operations rather than conventional constructs such as explicit 'for' loops.
  • Skills Rehearsed
    • Python

In-Class Activities

  • Instructor led concept onboarding
  • Code Alongs
  • In Class Quiz Administration
  • Periodic Recap - Closer to the end of session
  • In Class Assignments - Motivation
  • Take Away Assignments

Why complete this?

  • You will become acquainted with the power tool of pandas: the DataFrame. You will learn how to use pandas to import and then inspect a variety of datasets.

  • Having learned the fundamentals of working with DataFrames, you will then move on to more advanced indexing techniques. These let you tidy and rearrange your data into the format that is easiest to analyze for insights.

Learning Objective

After this lesson, you'll be able to

  • Explain why pandas is needed in data science
  • Manipulate and transform data
  • Build pivot tables and group-by aggregations
  • Merge data
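The objectives above can be sketched in a few lines. This is only an illustration on hypothetical toy tables (`readings` and `cities`, with invented column names and values), not the class notebook's code; it uses the standard pandas calls `groupby`, `pivot_table` and `merge`.

```python
import pandas as pd

# Hypothetical toy tables: names and numbers are invented for illustration.
readings = pd.DataFrame(
    {
        "city": ["X", "X", "Y", "Y"],
        "month": ["Jan", "Feb", "Jan", "Feb"],
        "rain_mm": [10, 20, 5, 15],
    }
)
cities = pd.DataFrame({"city": ["X", "Y"], "region": ["North", "South"]})

# Transformation: derive a new column from an existing one (vectorized, no loop).
readings["rain_cm"] = readings["rain_mm"] / 10

# Group by: total rainfall per city.
total_by_city = readings.groupby("city")["rain_mm"].sum()

# Pivot table: one row per city, one column per month.
pivot = readings.pivot_table(
    index="city", columns="month", values="rain_mm", aggfunc="sum"
)

# Merge: attach the region of each city to every reading.
merged = readings.merge(cities, on="city", how="left")

print(total_by_city)
print(pivot)
print(merged)
```

A left merge keeps every row of `readings` and adds the matching `region`, which is usually what you want when enriching a fact table with lookup data.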

Pre Reads

Slides

Check the Jupyter Notebook in the top right of the screen

Post Reads

Project

In the Indian Premier League (IPL), teams representing Indian cities compete each year. Chris Gayle is the highest run scorer in the IPL. Can you find the second-highest run scorer without using a 'for' loop? This module shows you how to answer questions like this by manipulating large datasets to extract business insights.
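One possible loop-free approach to the second-highest-scorer question is to group by batsman, sum the runs, and take the two largest totals. The sketch below runs on a hypothetical toy table; the player labels, the column names `batsman` and `runs`, and every number are invented, and the real project dataset may be structured differently.

```python
import pandas as pd

# Hypothetical ball-by-ball records: players, columns and runs are invented.
deliveries = pd.DataFrame(
    {
        "batsman": [
            "Player A", "Player B", "Player A", "Player C",
            "Player B", "Player C", "Player A",
        ],
        "runs": [4, 6, 1, 2, 4, 0, 6],
    }
)

# Total runs per batsman, computed without an explicit loop.
runs_by_batsman = deliveries.groupby("batsman")["runs"].sum()

# Keep the two highest totals, sorted from largest to smallest.
top_two = runs_by_batsman.nlargest(2)

# The second entry is the second-highest run scorer.
second_highest = top_two.index[1]
print(second_highest, top_two.iloc[1])
```

`nlargest(2)` avoids sorting the full result when only the top few totals are needed.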

