## Pandas

When working with tabular data, such as data stored in a spreadsheet or a database, pandas is a widely used tool. It helps you explore, clean, and process your data. A data table in pandas is called a <b>DataFrame</b>. A DataFrame is made up of rows and columns, and each column is a Series.

![dataframe.png](attachment:dataframe.png)


Pandas can read tabular data from, and write it to, file formats such as CSV, Excel, and JSON.
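
As a rough illustration of reading and writing, the sketch below writes a small table to CSV text and reads it back. The table and its column names are made up for this example, and an in-memory buffer stands in for a file on disk. The same pattern applies to `read_excel`/`to_excel` and `read_json`/`to_json`.

```python
import io

import pandas as pd

# Hypothetical sample table; the titles and column names are invented for this sketch.
films = pd.DataFrame({
    "title": ["Alpha", "Beta", "Gamma"],
    "release_year": [2018, 2019, 2019],
})

# Write the table as CSV text. index=False leaves the row labels out.
csv_text = films.to_csv(index=False)

# Read it back. read_csv accepts a file path or any file-like object.
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip)
```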

### Creating objects
You can also create a DataFrame by passing a list, a NumPy array, or a dictionary. A DataFrame can hold different data types (float, str, boolean, etc.), and each column has its own data type.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
# Series (a single labeled column) from a list; mixed types give dtype object
s = pd.Series([0, 1, 2, 'values', np.nan, 5.4])
s

In [None]:
# DatetimeIndex holding the first 12 days of 2021
dates = pd.date_range('20210101', periods=12)
dates
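
A DataFrame itself is built with `pd.DataFrame`. The following is a minimal sketch using a dictionary and a 2-D NumPy array; the names and values are invented for illustration.

```python
import numpy as np
import pandas as pd

# From a dictionary: each key becomes a column name.
from_dict = pd.DataFrame({
    "name": ["Ann", "Ben", "Cy"],
    "score": [9.5, 7.0, 8.25],
    "passed": [True, False, True],
})

# From a 2-D NumPy array: supply the column names yourself.
from_array = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["a", "b"])

# Each column keeps its own data type.
print(from_dict.dtypes)
print(from_array)
```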

### Viewing data


In [None]:
data = pd.read_csv('./sample.csv')

# column names, dtypes, non-null counts, and memory usage
data.info()


In [None]:
#first five rows 
data.head()

In [None]:
# row whose index label is 2 (the third row when the default index is used)
data.loc[2]
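
`loc` selects by index label, while `iloc` selects by integer position. The difference is easiest to see when the labels are not integers. This is a small sketch on a made-up table, not the sample CSV.

```python
import pandas as pd

# Hypothetical table with string row labels.
films = pd.DataFrame(
    {"title": ["Alpha", "Beta", "Gamma"], "release_year": [2018, 2019, 2020]},
    index=["x", "y", "z"],
)

by_label = films.loc["y"]        # row whose index label is "y"
by_position = films.iloc[1]      # second row, counting from 0
cell = films.loc["z", "title"]   # single value: row "z", column "title"

print(by_label)
print(cell)
```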

### Selecting specific columns from a df
Use square brackets [ ] with the column name you would like to select. Use a list of column names when selecting more than one column.

In [None]:
#selecting specific columns
one_column = data['type']
one_column.head()

In [None]:
#selecting two or more columns
two_columns = data[['cast', 'director']]
two_columns.head()
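
The type of the result depends on what goes inside the brackets: a single column name returns a Series, while a list of names returns a DataFrame, even if the list has only one entry. A short sketch on an invented table:

```python
import pandas as pd

# Hypothetical table; the values are made up for this sketch.
films = pd.DataFrame({
    "title": ["Alpha", "Beta"],
    "type": ["Movie", "TV Show"],
    "release_year": [2018, 2019],
})

as_series = films["type"]     # one name      -> Series
as_frame = films[["type"]]    # list of names -> DataFrame

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```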

### Filtering specific rows
To select rows based on a condition, put the condition inside the selection brackets. The condition produces a boolean Series, and only the rows where it is True are kept. For example, to get the movies released in 2019:

In [None]:
# boolean Series: True where release_year equals 2019
data['release_year'] == 2019

In [None]:
movies_2019 = data[data['release_year'].isin([2019])]
movies_2019.head()
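
Conditions can also be combined. The sketch below, on a made-up table, uses `&` (and) with each comparison wrapped in parentheses, and `isin` with more than one allowed value.

```python
import pandas as pd

# Hypothetical table; the values are invented for this sketch.
films = pd.DataFrame({
    "title": ["Alpha", "Beta", "Gamma", "Delta"],
    "type": ["Movie", "TV Show", "Movie", "Movie"],
    "release_year": [2018, 2019, 2019, 2020],
})

# Combine boolean Series with & (and) or | (or).
# The parentheses are required because & binds tighter than ==.
only_movies_2019 = films[(films["type"] == "Movie") & (films["release_year"] == 2019)]

# isin keeps rows whose value matches any item in the list.
recent = films[films["release_year"].isin([2019, 2020])]

print(only_movies_2019)
print(recent)
```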

In [None]:
# line plot of release_year against the row index
data['release_year'].plot()

In [None]:
# bar chart of the number of rows per release year
data['release_year'].value_counts().plot(kind='bar')

In [None]:
# create a new column; the scalar 2 is assigned to every row
# (an existing 'rating' column would be overwritten)
data['rating'] = 2
data.head(2)
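
A new column is more often computed from existing ones than filled with a constant. As a small sketch on an invented table, arithmetic on a column is applied element-wise; the column name `age_in_2021` is hypothetical.

```python
import pandas as pd

# Hypothetical table; the values are made up for this sketch.
films = pd.DataFrame({
    "title": ["Alpha", "Beta"],
    "release_year": [2018, 2020],
})

# The subtraction runs once per row, producing a new Series.
films["age_in_2021"] = 2021 - films["release_year"]

print(films)
```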

In [None]:
# sort rows by the 'country' column; sort_values returns a new DataFrame and leaves data unchanged

data.sort_values(by='country').head()
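
Sorting also works on several columns at once, with a separate direction for each. This sketch uses a made-up table: rows are ordered by country A to Z, then by release year from newest to oldest within each country.

```python
import pandas as pd

# Hypothetical table; the values are invented for this sketch.
films = pd.DataFrame({
    "title": ["Alpha", "Beta", "Gamma"],
    "country": ["US", "India", "US"],
    "release_year": [2018, 2019, 2020],
})

# One ascending flag per sort key.
ordered = films.sort_values(by=["country", "release_year"], ascending=[True, False])

print(ordered)
```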

In [None]:
# lowercase every value in the 'director' column using the .str accessor
data['director'].str.lower()
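
String methods on the `.str` accessor can be chained, and missing values pass through as missing. A short sketch with invented names:

```python
import pandas as pd

# Hypothetical director names with stray spaces, mixed case, and a missing value.
directors = pd.Series(["  Jane Doe", "JOHN SMITH ", None])

# Strip the surrounding whitespace, then lowercase.
cleaned = directors.str.strip().str.lower()

# Substring test; na=False treats missing values as "no match".
has_doe = cleaned.str.contains("doe", na=False)

print(cleaned)
print(has_doe)
```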

### Resources
1. [Pandas Documentation](https://pandas.pydata.org/pandas-docs/stable/index.html)