# <center>Introduction to Pandas</center>

![](https://pandas.pydata.org/_static/pandas_logo.png)


## Installation

Install pandas with pip:
```
pip install pandas
```


## Reading data from a CSV file

You can read data from a CSV file using the ``read_csv`` function. By default, it assumes that the fields are comma-separated.
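
As a quick illustration of the default comma separator, here is a minimal sketch. The CSV text and the names `csv_text`, `movies_df` and `movies_sc_df` are made up for this example (they are not the workshop datasets), and `io.StringIO` stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file.
csv_text = """title,genre,duration
Movie A,Crime,142
Movie B,Drama,175
Movie C,Crime,96
"""

# read_csv accepts a path or a file-like object; fields are split on "," by default.
movies_df = pd.read_csv(io.StringIO(csv_text))

# The same data separated by semicolons needs an explicit `sep` argument.
semicolon_text = csv_text.replace(",", ";")
movies_sc_df = pd.read_csv(io.StringIO(semicolon_text), sep=";")

# Peek at the first two rows.
print(movies_df.head(2))
```

With a real file you would pass its path instead, for example `pd.read_csv("imdb.csv")`.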

In [None]:
# import pandas

>The `imdb.csv` dataset contains the highest-rated IMDb "Top 1000" titles.

In [None]:
# load imdb dataset as pandas dataframe

In [None]:
# show first 5 rows of imdb_df

>The `bikes.csv` dataset contains information about the number of bicycles that used certain bicycle lanes in Montreal in the year 2012.

In [None]:
# load bikes dataset as pandas dataframe

In [None]:
# show first 3 rows of bikes_df

## Selecting columns

When you read a CSV, you get a kind of object called a DataFrame, which is made up of rows and columns. You get columns out of a DataFrame the same way you get elements out of a dictionary.
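
The dictionary-style access described above might look like this. It is only a sketch on a tiny hand-made DataFrame; `toy_movies` and its values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data, not the real imdb.csv.
toy_movies = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C"],
    "genre": ["Crime", "Drama", "Crime"],
    "duration": [142, 175, 96],
})

# A single key returns one column as a Series.
titles = toy_movies["title"]

# A list of keys returns a smaller DataFrame with just those columns.
title_and_genre = toy_movies[["title", "genre"]]

# Column names and the datatype of each column.
print(list(toy_movies.columns))
print(toy_movies.dtypes)
```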

In [None]:
# list columns of imdb_df

In [None]:
# what are the datatypes of values in columns

In [None]:
# list first 5 movie titles

In [None]:
# show only movie title and genre

## Understanding columns

Each column of a DataFrame is a ``pd.Series``, and a Series usually stores its data in a NumPy array. If you add ``.values`` to the end of a Series, you'll get that underlying **NumPy array** (newer pandas code generally prefers ``.to_numpy()`` for this).
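
A small sketch of the Series/NumPy relationship, using a made-up `toy_durations` column rather than the real dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical toy column of movie durations in minutes.
toy_df = pd.DataFrame({"duration": [142, 175, 96]})

# Pulling out one column gives a pandas Series.
toy_durations = toy_df["duration"]
print(type(toy_durations))

# .values exposes the data as a NumPy array.
as_array = toy_durations.values

# .to_numpy() returns the same data and is the spelling recommended by the pandas docs.
as_array_too = toy_durations.to_numpy()
print(as_array)
```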

In [None]:
# show the type of duration column

In [None]:
# show duration values of movies as numpy arrays

## Applying functions to columns

Use the `.apply` method to apply a function to each element of a column.
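
For instance, on a made-up Series of titles (`toy_titles` is a hypothetical name), element-wise application could be sketched as:

```python
import pandas as pd

# Hypothetical toy titles.
toy_titles = pd.Series(["Movie A", "movie b"])

# apply calls the given function once per element.
upper_titles = toy_titles.apply(str.upper)

# Any callable works, including built-ins such as len.
title_lengths = toy_titles.apply(len)

# For common string operations, the vectorized .str accessor gives the same result.
upper_titles_vectorized = toy_titles.str.upper()

print(upper_titles.tolist())
```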

In [None]:
# convert all the movie titles to uppercase

## Plotting a column

Use the ``.plot()`` method! (Plotting relies on matplotlib being installed.)

In [None]:
# plot the bikers travelling to Berri1 over the year

In [None]:
# plot all the columns of bikes_df

## Value counts

Use ``.value_counts()`` to count how many times each unique value occurs in a particular column/Series.
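
A minimal sketch on a made-up genre column (`toy_genres` is hypothetical):

```python
import pandas as pd

# Hypothetical toy genre column.
toy_genres = pd.Series(["Crime", "Drama", "Crime", "Action", "Crime", "Drama"])

# value_counts returns a Series: unique values as the index, counts as the data,
# sorted from most to least frequent.
genre_counts = toy_genres.value_counts()
print(genre_counts)

# unique() lists the distinct values; nunique() just counts them.
distinct_genres = toy_genres.unique()
n_genres = toy_genres.nunique()
```

Because the result is itself a Series, it can be plotted directly, for example with `genre_counts.plot(kind="bar")`.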

In [None]:
# what are the unique genres in imdb_df?

In [None]:
# plotting value counts of unique genres as a bar chart

In [None]:
# plotting value counts of unique genres as a pie chart

## Index

### DATAFRAME = COLUMNS + INDEX + 2-D DATA

### SERIES = INDEX + 1-D DATA

The **Index** (or **row labels**) is one of the fundamental data structures of pandas. It can be thought of as an **immutable array** and an **ordered set**.

> Every row is identified by its index value. Index values are usually unique, although pandas does not require it.
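
To make this concrete, here is a sketch with a made-up date-indexed DataFrame (`toy_bikes` and its counts are assumptions, not the real `bikes.csv`). It looks a row up by label with ``.loc[]`` and shows that the index rejects in-place modification.

```python
import pandas as pd

# Hypothetical toy counts indexed by date.
toy_bikes = pd.DataFrame(
    {"Berri1": [35, 83, 135]},
    index=pd.to_datetime(["2012-01-01", "2012-01-02", "2012-01-03"]),
)

# The index holds the row labels.
print(toy_bikes.index)

# .loc selects by label; here the label is a timestamp.
new_year_row = toy_bikes.loc[pd.Timestamp("2012-01-01")]
print(new_year_row)

# An Index is immutable: assigning to one of its elements raises TypeError.
try:
    toy_bikes.index[0] = pd.Timestamp("2013-01-01")
    index_is_mutable = True
except TypeError:
    index_is_mutable = False
print(index_is_mutable)
```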

In [None]:
# show index of bikes_df

In [None]:
# get row for date 2012-01-01

#### To get a row by integer position:

Use ``.iloc[]`` for purely integer-location-based selection, i.e. selection by position rather than by label.
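
A short sketch of position-based selection, using a made-up DataFrame with letter labels so that position and label are easy to tell apart (`toy_rows` is hypothetical):

```python
import pandas as pd

# Hypothetical toy data: 12 rows labelled "a" through "l".
toy_rows = pd.DataFrame(
    {"title": [f"Movie {i}" for i in range(1, 13)]},
    index=list("abcdefghijkl"),
)

# Positions start at 0, so the 11th row is at position 10.
eleventh_row = toy_rows.iloc[10]

# Slices work as in plain Python: the end position is excluded.
first_three = toy_rows.iloc[:3]

print(eleventh_row)
```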

In [None]:
# show 11th row of imdb_df using iloc

## Selecting rows where column has a particular value
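
One common way to do this is a boolean mask: compare a column to a value and use the resulting True/False Series to index the DataFrame. The sketch below uses made-up data (`toy_movies` and its column values are assumptions).

```python
import pandas as pd

# Hypothetical toy data.
toy_movies = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "genre": ["Adventure", "Crime", "Adventure", "Drama"],
    "star_rating": [8.5, 9.0, 7.9, 8.2],
    "duration": [150, 120, 140, 135],
})

# The comparison yields a boolean Series; indexing with it keeps the True rows.
adventure_movies = toy_movies[toy_movies["genre"] == "Adventure"]

# Combine conditions with & (and) or | (or); each condition needs parentheses.
long_and_good = toy_movies[
    (toy_movies["star_rating"] > 8) & (toy_movies["duration"] > 130)
]

print(adventure_movies)
print(long_and_good)
```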

In [None]:
# select only those movies where genre is adventure

In [None]:
# which genre has highest number of movies with star rating above 8 and duration more than 130 minutes?

## Adding a new column to DataFrame
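
Assigning to a column name that does not exist yet creates the column. A minimal sketch on made-up date-indexed data (`toy_rides` is hypothetical):

```python
import pandas as pd

# Hypothetical toy counts indexed by date.
toy_rides = pd.DataFrame(
    {"Berri1": [35, 83, 135]},
    index=pd.to_datetime(["2012-01-01", "2012-01-02", "2012-01-03"]),
)

# A DatetimeIndex exposes .weekday (Monday=0 ... Sunday=6).
toy_rides["weekday"] = toy_rides.index.weekday

# day_name() gives the readable name instead of the number.
toy_rides["weekday_name"] = toy_rides.index.day_name()

print(toy_rides)
```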

In [None]:
# add a weekday column to bikes_df

## Deleting an existing column from DataFrame
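
One option is ``.drop(columns=...)``, which returns a new DataFrame and leaves the original untouched. A sketch on made-up data (`toy_counts` is hypothetical):

```python
import pandas as pd

# Hypothetical toy data with an unwanted, empty column.
toy_counts = pd.DataFrame({"Berri1": [35, 83], "Unnamed: 1": [None, None]})

# drop returns a copy without the listed columns.
cleaned = toy_counts.drop(columns=["Unnamed: 1"])

print(list(cleaned.columns))
```

To change the original variable, assign the result back to it, e.g. `toy_counts = toy_counts.drop(columns=["Unnamed: 1"])`.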

In [None]:
# remove column 'Unnamed: 1' from bikes_df

## Deleting a row in DataFrame
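
``.drop()`` also removes rows, but it works on index labels, not positions. To drop by position, look the label up first. A sketch on made-up data (`toy_rows` is hypothetical):

```python
import pandas as pd

# Hypothetical toy data with letter labels.
toy_rows = pd.DataFrame({"Berri1": [35, 83, 135]}, index=["a", "b", "c"])

# Drop the row whose label is "b".
dropped_by_label = toy_rows.drop(index="b")

# Drop the row at position 1 by translating the position into its label.
dropped_by_position = toy_rows.drop(index=toy_rows.index[1])

print(dropped_by_label)
```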

In [None]:
# remove row no. 1 from bikes_df

## Group By

A groupby operation involves one or more of the following steps on the original object:

- Splitting the object into groups

- Applying a function to each group

- Combining the results

In many situations, we split the data into groups and apply some function to each group. In the apply step, we can perform the following operations:

- **Aggregation** − computing a summary statistic for each group

- **Transformation** − performing some group-specific operation

- **Filtration** − discarding groups based on some condition
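
The three kinds of operation above can be sketched on a tiny made-up DataFrame. `toy_movies` and its values are assumptions for illustration; the threshold of 120 simply mirrors the kind of condition used in the exercises.

```python
import pandas as pd

# Hypothetical toy data.
toy_movies = pd.DataFrame({
    "genre": ["Crime", "Crime", "Drama", "Drama", "Action"],
    "duration": [100, 140, 130, 150, 90],
})

# Split: one group per distinct genre.
by_genre = toy_movies.groupby("genre")

# Inspect a single group.
crime_movies = by_genre.get_group("Crime")

# Aggregation: one summary value per group.
mean_duration = by_genre["duration"].mean()

# Transformation: a result with the same length as the input,
# here each movie gets the mean duration of its genre.
genre_mean_per_movie = by_genre["duration"].transform("mean")

# Filtration: keep only the rows of groups that pass a condition.
long_genres = by_genre.filter(lambda group: group["duration"].mean() > 120)

print(mean_duration)
print(long_genres)
```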

In [None]:
# group imdb_df by movie genres

In [None]:
# get crime movies group

In [None]:
# get mean of movie durations for each group

In [None]:
# change duration of all movies in a particular genre to mean duration of the group

In [None]:
# drop groups/genres that do not have average movie duration greater than 120.

In [None]:
# group weekday wise bikers count

In [None]:
# get weekday wise biker count

In [None]:
# plot weekday wise biker count for 'Berri1'

![](https://memegenerator.net/img/instances/500x/73988569/pythonpandas-is-easy-import-and-go.jpg)