## Data Structures

Pandas provides two primary data structures for handling data:

1. Series: A one-dimensional labeled array, similar to a list or NumPy array, but with labeled indices.
2. DataFrame: A two-dimensional labeled data structure, similar to a table or a spreadsheet, with rows and columns.

## 1. Pandas Series:

A Series is essentially a single column of data. It can hold data of any type (integer, float, string, etc.) and comes with an index that labels each data element.


-> Creating a Pandas Series:

In [1]:
import pandas as pd

# Creating a Series from a list
series1 = pd.Series([10, 20, 30, 40, 50])

# Printing the Series
print(series1)

0    10
1    20
2    30
3    40
4    50
dtype: int64


- The left column (0, 1, 2, ...) represents the index (default integers).
- The right column represents the values in the Series.
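
The two parts of a Series described above can also be inspected separately. The following is a minimal sketch that reuses the values of `series1`; the variable names `s`, `index_labels`, and `values` are illustrative only.

```python
import pandas as pd

# Same values as series1; `s` is an illustrative name
s = pd.Series([10, 20, 30, 40, 50])

index_labels = list(s.index)  # the default integer labels
values = s.to_numpy()         # the underlying NumPy array of values

print(index_labels)  # [0, 1, 2, 3, 4]
print(values)        # [10 20 30 40 50]
print(s.index)       # RangeIndex(start=0, stop=5, step=1)
```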

-> Creating a Series with Custom Index:

In [3]:
# Creating a Series with custom index
series2 = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])

# Printing the Series
print(series2)

a    10
b    20
c    30
d    40
e    50
dtype: int64


-> Accessing Data in Series:

In [4]:
# Accessing data by index
print(series2['b'])  # Output: 20

20
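
Square brackets are not the only way to read values from a Series. As a rough sketch (the names `s`, `by_label`, and so on are illustrative), `.loc` looks values up by label and `.iloc` by position:

```python
import pandas as pd

# Same data as series2; `s` is an illustrative name
s = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])

by_label = s.loc['b']         # label-based lookup
by_position = s.iloc[1]       # position-based lookup (second element)
label_slice = s.loc['b':'d']  # label slices include both endpoints
several = s[['a', 'e']]       # a list of labels returns a new Series

print(by_label, by_position)  # 20 20
print(label_slice.tolist())   # [20, 30, 40]
print(several.tolist())       # [10, 50]
```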


## 2. Pandas DataFrame:

A DataFrame is a two-dimensional labeled data structure with columns and rows. It can be thought of as a table, where each column can have different types of data (e.g., numeric, string, etc.).

In [5]:
# Creating a DataFrame from a dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}

df = pd.DataFrame(data)

# Printing the DataFrame
print(df)

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston


- The row labels (0, 1, 2, 3) form the index (default integers).
- The column labels ('Name', 'Age', 'City') name the attributes stored for each row.
- Each value sits at the intersection of one row label and one column label.
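
These pieces are available as attributes of the DataFrame. A small sketch, rebuilding the same `data` dictionary so that it runs on its own:

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
}
df = pd.DataFrame(data)

print(df.shape)          # (4, 3): 4 rows, 3 columns
print(list(df.index))    # [0, 1, 2, 3]
print(list(df.columns))  # ['Name', 'Age', 'City']
print(df.dtypes)         # one dtype per column
```

`df.dtypes` is a quick way to confirm that each column keeps its own type, for example integers for `Age` and text for `Name`.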

-> Creating a DataFrame with Custom Index:

In [6]:
# Creating a DataFrame with a custom index
df_custom_index = pd.DataFrame(data, index=['a', 'b', 'c', 'd'])

# Printing the DataFrame
print(df_custom_index)

      Name  Age         City
a    Alice   25     New York
b      Bob   30  Los Angeles
c  Charlie   35      Chicago
d    David   40      Houston


-> Accessing Data in DataFrame:

In [7]:
# Accessing a column
print(df['Name'])  # Output: Series of 'Name' column

# Accessing a row by index label
print(df.loc[1])   # Output: Row with label 1

# Accessing multiple rows and columns (.loc slices include the end label)
print(df.loc[0:2, ['Name', 'City']])  # Rows 0, 1, 2 and columns 'Name' and 'City'

0      Alice
1        Bob
2    Charlie
3      David
Name: Name, dtype: object
Name            Bob
Age              30
City    Los Angeles
Name: 1, dtype: object
      Name         City
0    Alice     New York
1      Bob  Los Angeles
2  Charlie      Chicago
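
With the default integer index, labels and positions happen to coincide, which can hide the difference between `.loc` and `.iloc`. The sketch below uses a custom index to make it visible; `df_custom` and the other variable names are illustrative only.

```python
import pandas as pd

df_custom = pd.DataFrame(
    {
        'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
    },
    index=['a', 'b', 'c', 'd'],
)

row_by_label = df_custom.loc['b']          # row labeled 'b'
row_by_position = df_custom.iloc[1]        # second row, whatever its label
label_rows = df_custom.loc['a':'c']        # 'a', 'b', 'c' (end label included)
position_rows = df_custom.iloc[0:2]        # positions 0 and 1 (end excluded)
single_value = df_custom.at['c', 'City']   # one cell by row and column label

print(row_by_label['Name'])  # Bob
print(len(label_rows))       # 3
print(len(position_rows))    # 2
print(single_value)          # Chicago
```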


In [9]:
# Display df as a rendered table (rich notebook output)
df

|   | Name | Age | City |
|---|---|---|---|
| 0 | Alice | 25 | New York |
| 1 | Bob | 30 | Los Angeles |
| 2 | Charlie | 35 | Chicago |
| 3 | David | 40 | Houston |


## Load a DataFrame from a CSV File

In [22]:
import pandas as pd

# Load DataFrame from a CSV file
df = pd.read_csv('pokemon_data.csv')

# Display the DataFrame (long output is truncated in the middle)
df

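|   | # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 4 | 4 | Charmander | Fire | NaN | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 795 | 719 | Diancie | Rock | Fairy | 50 | 100 | 150 | 100 | 150 | 50 | 6 | True |
| 796 | 719 | DiancieMega Diancie | Rock | Fairy | 50 | 160 | 110 | 160 | 110 | 110 | 6 | True |
| 797 | 720 | HoopaHoopa Confined | Psychic | Ghost | 80 | 110 | 60 | 150 | 130 | 70 | 6 | True |
| 798 | 720 | HoopaHoopa Unbound | Psychic | Dark | 80 | 160 | 60 | 170 | 130 | 80 | 6 | True |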
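
To try `read_csv` without the data file, the same call can read from an in-memory string. This is only a sketch: `csv_text` is a hypothetical sample whose rows mirror a few of those shown above, and `sample` is an illustrative name.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV; the rows mirror a few of those shown above
csv_text = """#,Name,Type 1,Type 2,HP
1,Bulbasaur,Grass,Poison,45
2,Ivysaur,Grass,Poison,60
4,Charmander,Fire,,39
"""

sample = pd.read_csv(io.StringIO(csv_text))

print(sample.head(2))                  # only the first two rows
print(sample.shape)                    # (3, 5)
print(sample['Type 2'].isna().sum())   # 1: the empty field becomes NaN
```

`head(n)` returns only the first `n` rows, whereas evaluating the DataFrame on its own displays the whole table, truncated when it is long.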


## Load a DataFrame from a txt File

In [23]:
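# Reads a whitespace-delimited .txt file
import pandas as pd

# Load DataFrame from a whitespace-delimited .txt file.
# sep=r'\s+' replaces delim_whitespace=True, which is deprecated since pandas 2.2.
# Fields that contain spaces (e.g. 'Type 1', 'Mega Venusaur') are split
# across columns, as the output shows.
df = pd.read_csv('pokemon_data.txt', sep=r'\s+')

# Display the DataFrame
df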


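|   | # | Name | Type | 1 | Type.1 | 2 | HP | Attack | Defense | Sp. | Atk | Sp..1 | Def | Speed | Generation | Legendary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 45 | 49 | 49 | 65 | 65 | 45 | 1 | FALSE | NaN | NaN | NaN | NaN |
| 1 | 2 | Ivysaur | Grass | Poison | 60 | 62 | 63 | 80 | 80 | 60 | 1 | FALSE | NaN | NaN | NaN | NaN |
| 2 | 3 | Venusaur | Grass | Poison | 80 | 82 | 83 | 100 | 100 | 80 | 1 | FALSE | NaN | NaN | NaN | NaN |
| 3 | 3 | VenusaurMega | Venusaur | Grass | Poison | 80 | 100 | 123 | 122 | 120 | 80 | 1 | FALSE | NaN | NaN | NaN |
| 4 | 4 | Charmander | Fire | 39 | 52 | 43 | 60 | 50 | 65 | 1 | FALSE | NaN | NaN | NaN | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 795 | 719 | Diancie | Rock | Fairy | 50 | 100 | 150 | 100 | 150 | 50 | 6 | TRUE | NaN | NaN | NaN | NaN |
| 796 | 719 | DiancieMega | Diancie | Rock | Fairy | 50 | 160 | 110 | 160 | 110 | 110 | 6 | TRUE | NaN | NaN | NaN |
| 797 | 720 | HoopaHoopa | Confined | Psychic | Ghost | 80 | 110 | 60 | 150 | 130 | 70 | 6 | TRUE | NaN | NaN | NaN |
| 798 | 720 | HoopaHoopa | Unbound | Psychic | Dark | 80 | 160 | 60 | 170 | 130 | 80 | 6 | TRUE | NaN | NaN | NaN |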
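
The shifted columns in this output come from splitting on every run of whitespace, which also cuts names such as `Type 1` in two. The sketch below reproduces the effect on a small in-memory sample and compares it with splitting on tabs. The sample text `txt` is hypothetical, and whether `pokemon_data.txt` itself is tab-separated is an assumption, not something the notebook states.

```python
import io
import pandas as pd

# Hypothetical tab-separated text; note the spaces inside some fields
txt = (
    "#\tName\tType 1\tHP\n"
    "3\tVenusaurMega Venusaur\tGrass\t80\n"
    "4\tCharmander\tFire\t39\n"
)

# Splitting on tabs keeps 'Type 1' and 'VenusaurMega Venusaur' intact
by_tab = pd.read_csv(io.StringIO(txt), sep='\t')

# Splitting on any whitespace breaks those fields apart
by_space = pd.read_csv(io.StringIO(txt), sep=r'\s+')

print(list(by_tab.columns))     # ['#', 'Name', 'Type 1', 'HP']
print(list(by_space.columns))   # ['#', 'Name', 'Type', '1', 'HP']
print(by_space.loc[0, 'Name'])  # VenusaurMega
```

If a text file's fields can contain spaces, a delimiter that never appears inside a field (a tab or a comma, for instance) is usually the safer choice.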


### Reading an Excel file

```
df = pd.read_excel('data.xlsx')
```

### Reading a JSON file

```
df = pd.read_json('data.json')
```
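
As with CSV, `read_json` can be tried on an in-memory string. A minimal sketch, assuming the JSON is a list of records (one object per row); `json_text` and `people` are illustrative names and the data is made up for the example.

```python
import io
import pandas as pd

# Hypothetical JSON: a list of records, one object per row
json_text = '[{"Name": "Alice", "Age": 25}, {"Name": "Bob", "Age": 30}]'

people = pd.read_json(io.StringIO(json_text))

print(list(people.columns))  # ['Name', 'Age']
print(people.shape)          # (2, 2)
print(people['Age'].sum())   # 55
```

Note that `read_excel` relies on an optional engine package (for example `openpyxl` for `.xlsx` files), which has to be installed separately.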