# Pandas Lesson

## Introduction

Pandas, whose name derives from "panel data" and "Python data analysis", is a library for data manipulation and analysis. It is built on top of `NumPy` and provides high-performance, easy-to-use data structures and data analysis tools, designed for quick and easy data manipulation, aggregation, and visualization.

Pandas is often used in conjunction with numerical computing tools like `NumPy` and `SciPy`, analytical libraries like `statsmodels` and `scikit-learn`, and data visualization libraries like `Matplotlib` and `Seaborn`. While pandas adopts many coding idioms from `NumPy`, the biggest difference is that pandas is designed for working with tabular or heterogeneous data. `NumPy`, by contrast, is best suited for working with homogeneous numerical array data.
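The contrast between homogeneous and heterogeneous data can be illustrated with a minimal sketch. The column names and values here are made up for illustration and are not part of the lesson's data.

```python
import numpy as np
import pandas as pd

# A NumPy array holds a single dtype for all of its elements.
arr = np.array([[1, 2.5], [3, 4.5]])  # the integers are upcast to float64

# A DataFrame keeps a separate dtype per column, so text, integers,
# and floats can sit side by side in one table.
df = pd.DataFrame(
    {
        "name": ["a", "b"],     # hypothetical text column
        "count": [1, 2],        # hypothetical integer column
        "score": [2.5, 4.5],    # hypothetical float column
    }
)

print(arr.dtype)
print(df.dtypes)
```

Printing `df.dtypes` lists one dtype per column, whereas `arr.dtype` is a single value for the whole array.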

Some of the key features of Pandas are:

- Fast and efficient DataFrame object with default and customized indexing.
- Tools for loading data into in-memory data objects from different file formats.
- Data alignment and integrated handling of missing data.
- Reshaping and pivoting of data sets.
- Label-based slicing, indexing and subsetting of large data sets.
- Group by data for aggregation and transformations.
- High performance merging and joining of data.
- Time series functionality.
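Two of the features listed above, label-based data alignment with missing-data handling and group-by aggregation, can be sketched briefly. The data below is invented for illustration only.

```python
import pandas as pd

# Two Series with partially overlapping labels (made-up data).
a = pd.Series([1.0, 2.0, 3.0], index=["x", "y", "z"])
b = pd.Series([10.0, 20.0], index=["y", "z"])

# Arithmetic aligns on labels rather than position; "x" has no
# partner in b, so the result is NaN (missing) at that label.
total = a + b

# Group rows by a key column and aggregate each group.
df = pd.DataFrame({"key": ["a", "b", "a"], "value": [1, 2, 3]})
sums = df.groupby("key")["value"].sum()

print(total)
print(sums)
```

Here `total` is missing at `"x"` and holds the label-wise sums at `"y"` and `"z"`, while `sums` has one entry per distinct value of `key`.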


You can install Pandas using `conda` or `pip`:

```bash
conda install pandas
```

```bash
pip install pandas
```

Then import it in your Python code:

```python
import pandas as pd
```

Here, `pd` is the conventional alias for pandas.

We usually import NumPy in tandem with pandas:

```python
import numpy as np
```

## Data Structures

Pandas has two main data structures:

- Series
- DataFrame

### Series

A Series is a one-dimensional array-like object containing a sequence of values (of types similar to NumPy's) and an associated array of data labels, called its index.

The simplest Series is formed from only an array of data:

```python
obj = pd.Series([4, 7, -5, 3])
obj
```

```text
0    4
1    7
2   -5
3    3
dtype: int64
```
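The left column in the output is the index and the right column holds the values. As a minimal sketch, the two parts can be inspected separately, and custom labels can be supplied in place of the default integers. The labels `"d"`, `"b"`, `"a"`, `"c"` and the name `labeled` are chosen here purely for illustration.

```python
import pandas as pd

obj = pd.Series([4, 7, -5, 3])

# The values as a NumPy array, and the default integer index 0..3.
values = obj.to_numpy()
index = obj.index

# The same data with explicit (illustrative) string labels as the index.
labeled = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])

# A value can then be looked up by its label.
print(labeled["a"])
```

When no index is given, pandas uses the integers 0 through N - 1, where N is the length of the data.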