identity-fraud/9CT2_big-mac

Pandas

Pandas is a powerful data manipulation and analysis library for Python. It provides data structures and functions designed to make working with structured data (such as tabular, time series, or matrix data) fast, easy, and expressive.

Key Features of Pandas:

  • Data Structures: Pandas introduces two main data structures: Series (1-dimensional) and DataFrame (2-dimensional). These structures are built on top of NumPy arrays, providing additional functionality for data analysis.

  • Data Cleaning and Preparation: Pandas offers a wide range of tools for cleaning and preparing data. This includes handling missing data, reshaping data, merging and joining datasets, and more.

  • Data Selection and Indexing: Pandas provides powerful methods for selecting, indexing, and slicing data. This includes selecting specific rows or columns, boolean indexing, and hierarchical indexing (MultiIndex).

  • Data Aggregation and Grouping: Pandas allows for efficient data aggregation and grouping operations. This is useful for computing summary statistics, applying functions to grouped data, and performing complex transformations.

  • Time Series Analysis: Pandas has robust support for working with time series data. It includes functionality for date/time indexing, resampling, time zone handling, and date range generation.

  • Input/Output Tools: Pandas supports reading and writing data from various file formats, including CSV, Excel, JSON, SQL databases, and more. This makes it easy to work with data from different sources.
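
As a rough illustration of several of the features above (cleaning, selection, grouping, and date handling), here is a minimal sketch. The column names and all values are made up for this example and are not taken from the Big Mac activity data.

```python
import pandas as pd

# Hypothetical, made-up values for illustration only (not real Big Mac data).
prices = pd.DataFrame(
    {
        "country": ["A", "A", "B", "B"],
        "date": pd.to_datetime(
            ["2020-01-01", "2021-01-01", "2020-01-01", "2021-01-01"]
        ),
        "price": [4.0, None, 2.0, 3.0],
    }
)

# Data cleaning: replace the missing price with the mean of the known prices.
cleaned = prices.fillna({"price": prices["price"].mean()})

# Data selection: boolean indexing keeps rows whose price is above 2.5.
expensive = cleaned[cleaned["price"] > 2.5]

# Aggregation and grouping: mean price per country.
mean_by_country = cleaned.groupby("country")["price"].mean()

# Date handling: keep only the rows dated in 2021.
prices_2021 = cleaned[cleaned["date"].dt.year == 2021]

print(cleaned)
print(expensive)
print(mean_by_country)
print(prices_2021)
```

Filling a gap with the column mean is only one possible choice; depending on the data, dropping the row with dropna() may be more appropriate.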

How to Use Pandas:

To use Pandas, you first need to install it using pip:

pip install pandas

Once installed, you can import Pandas into your Python script or Jupyter Notebook:

import pandas as pd

From there, you can create Series and DataFrame objects, load data from files, perform data manipulation and analysis operations, and visualize your data using tools like Matplotlib or Seaborn.
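
The workflow described above might look something like the following sketch. The item names and prices are invented for illustration, and an in-memory text buffer stands in for a real .csv file so the example runs without touching the disk.

```python
import io

import pandas as pd

# A Series is a one-dimensional labelled array.
s = pd.Series([1.5, 2.5, 3.5], index=["a", "b", "c"], name="price")

# A DataFrame is a two-dimensional table of labelled columns.
# The items and prices here are hypothetical.
df = pd.DataFrame({"item": ["burger", "fries"], "price": [5.0, 2.0]})

# Write the table as CSV text into an in-memory buffer, then read it back.
# With a real file you would pass a path such as "data.csv" instead.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
loaded = pd.read_csv(buffer)

# A simple analysis step: total of the price column.
total = loaded["price"].sum()

print(s)
print(loaded)
print(total)
```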

Pandas is widely used in data science, machine learning, finance, and other domains for data exploration, preprocessing, and analysis tasks. It's a versatile and essential tool in the Python ecosystem for working with structured data effectively.

Your Activities

Complete the activities in the following order:

  1. BigMac.ipynb - Learn the basics of manipulating data with pandas

  2. BigMacReadWriteCSV.ipynb - Learn how to read from and write to .csv files using pandas

  3. BigMacMegaCSV.ipynb - Learn how to work with larger data sets using pandas

  4. VisualisingBigMacData.ipynb - Learn how to visualise datasets using pandas and matplotlib

  5. BigMacDataOmega.ipynb - Complete this activity using all you have learned. This will prepare you for your assessment task.

TO ASSIST YOU

These activities were developed so you can apply what you learn from the Pandas tutorials in our other repository, even-more-python-for-beginners-data-tools. Follow along with those tutorials to learn each concept as you complete the Big Mac activities.
