## Core Data Structures in Pandas
Pandas is built on two main data structures:

- Series: a one-dimensional labeled array (like a single column in Excel). Each value is paired with an index label.

- DataFrame: a two-dimensional labeled table (like a full spreadsheet or SQL table). It behaves like a dictionary of Series that share one index.
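
The "dictionary of Series" description can be made concrete with a minimal sketch. The names `ages`, `cities`, and `people`, and their values, are illustrative assumptions rather than data from this notebook.

```python
import pandas as pd

# Two labeled Series that share the same index (illustrative values)
ages = pd.Series([25, 30], index=["a", "b"])
cities = pd.Series(["Delhi", "Mumbai"], index=["a", "b"])

# Building a DataFrame from a dict of Series: keys become column names
people = pd.DataFrame({"age": ages, "city": cities})

# Selecting a single column gives back a Series
age_column = people["age"]
print(type(age_column))
print(age_column["b"])  # 30
```

Each column keeps the shared index, which is why a lookup by label works on the extracted column.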

In [13]:
import pandas as pd
 
s = pd.Series([10, 20, 30, 40])  # default integer index: 0, 1, 2, ...

s2 = pd.Series([10, 20, 30, 40], index=["jew", "los", "sj", "ed"])  # custom index labels
print(s2)
print(s2["los"])
print(type(s2))



jew    10
los    20
sj     30
ed     40
dtype: int64
20
<class 'pandas.core.series.Series'>
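
With a custom index, a value can be reached either by its label or by its position. The sketch below rebuilds the same Series and uses `.loc` and `.iloc`, which are not used elsewhere in this notebook and are shown here only as one possible way to make the distinction explicit.

```python
import pandas as pd

s2 = pd.Series([10, 20, 30, 40], index=["jew", "los", "sj", "ed"])

by_label = s2.loc["los"]         # label-based lookup
by_position = s2.iloc[1]         # position-based lookup (second element)
subset = s2.loc[["jew", "ed"]]   # a list of labels returns a smaller Series

print(by_label, by_position)     # 20 20
print(subset)
```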


## DataFrame
### From Dictionary of Lists

In [30]:
import pandas as pd

data = {
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["Delhi", "Mumbai", "Bangalore"]
}
 
df = pd.DataFrame(data)
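df          # the DataFrame itself
df.index    # row labels
df.columns  # column labels (the notebook displays only this last expression)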

Index(['name', 'age', 'city'], dtype='object')
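
A notebook cell displays only its last expression, which is why only the columns appear above. As a rough sketch, wrapping each attribute in `print` shows all of them from one cell or from a plain script:

```python
import pandas as pd

data = {
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["Delhi", "Mumbai", "Bangalore"],
}
df = pd.DataFrame(data)

print(df.index)          # RangeIndex(start=0, stop=3, step=1)
print(list(df.columns))  # ['name', 'age', 'city']
print(df.shape)          # (3, 3)
```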

### From Python Lists

In [33]:
 
data = [
    ["Alice", 25],
    ["Bob", 30],
    ["Charlie", 35]
]
 
df = pd.DataFrame(data, columns=["Name", "Age"], index=["one", "two", "three"])
print(df)

          Name  Age
one      Alice   25
two        Bob   30
three  Charlie   35
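
Custom row labels such as `"one"`, `"two"`, and `"three"` can then be used to select rows. The following sketch uses `.loc`, which this notebook does not otherwise cover, so treat it as one illustrative option rather than the only approach.

```python
import pandas as pd

data = [
    ["Alice", 25],
    ["Bob", 30],
    ["Charlie", 35],
]
df = pd.DataFrame(data, columns=["Name", "Age"], index=["one", "two", "three"])

row = df.loc["two"]          # one row, returned as a Series
age = df.loc["two", "Age"]   # a single cell: row label, then column label

print(row["Name"])  # Bob
print(age)          # 30
```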


### From NumPy Arrays

In [36]:
import numpy as np
 
arr = np.array([[1, 2], [3, 4]])
df = pd.DataFrame(arr, columns=["A", "B"])
df


   A  B
0  1  2
1  3  4
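
The conversion also works in the other direction. As a small sketch, `DataFrame.to_numpy()` returns the values as a NumPy array again; the variable name `back` is just an illustrative choice.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2], [3, 4]])
df = pd.DataFrame(arr, columns=["A", "B"])

back = df.to_numpy()              # DataFrame values as a NumPy array
print(np.array_equal(arr, back))  # True
print(df["A"].sum())              # 4 (column A holds 1 and 3)
```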


### From CSV, Excel, and JSON Files and the Web

In [15]:
# df = pd.read_csv("1745500125912-data.csv")
# df = pd.read_excel("data.xlsx")
# df = pd.read_json("data.json")
# df

url = "https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv"
df = pd.read_csv(url)
df

     total_bill   tip     sex  smoker  day    time  size
0         16.99  1.01  Female      No  Sun  Dinner     2
1         10.34  1.66    Male      No  Sun  Dinner     3
2         21.01  3.50    Male      No  Sun  Dinner     3
3         23.68  3.31    Male      No  Sun  Dinner     2
4         24.59  3.61  Female      No  Sun  Dinner     4
...         ...   ...     ...     ...  ...     ...   ...
239       29.03  5.92    Male      No  Sat  Dinner     3
240       27.18  2.00  Female     Yes  Sat  Dinner     2
241       22.67  2.00    Male     Yes  Sat  Dinner     2
242       17.82  1.75    Male      No  Sat  Dinner     2
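
`pd.read_csv` accepts file-like objects as well as paths and URLs, which makes it possible to try the reader without a network connection. The sketch below feeds it an in-memory string through `io.StringIO`; the three data rows are copied from the output above, and the names `csv_text` and `sample` are illustrative.

```python
import io

import pandas as pd

# A tiny CSV held in memory, using the first rows of the tips data shown above
csv_text = """total_bill,tip,sex,smoker,day,time,size
16.99,1.01,Female,No,Sun,Dinner,2
10.34,1.66,Male,No,Sun,Dinner,3
21.01,3.50,Male,No,Sun,Dinner,3
"""

sample = pd.read_csv(io.StringIO(csv_text))
print(sample.shape)  # (3, 7)
print(sample)
```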


## EDA (Exploratory Data Analysis)

In [29]:
df.head()         # First 5 rows
df.tail()         # Last 5 rows
df.info()         # Column info: dtypes, non-null counts (printed directly)
df.describe()     # Summary statistics for numeric columns
df.columns        # Index of column names
# df.shape        # (rows, columns)

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   total_bill  244 non-null    float64
 1   tip         244 non-null    float64
 2   sex         244 non-null    object 
 3   smoker      244 non-null    object 
 4   day         244 non-null    object 
 5   time        244 non-null    object 
 6   size        244 non-null    int64  
dtypes: float64(2), int64(1), object(4)
memory usage: 13.5+ KB


Index(['total_bill', 'tip', 'sex', 'smoker', 'day', 'time', 'size'], dtype='object')
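
Only the `info()` printout and the final `df.columns` expression appear above, because the results of the other calls are discarded unless they are printed or assigned. A minimal sketch that captures each result, using an assumed five-row frame built from the numeric columns of the first rows shown earlier:

```python
import pandas as pd

# Hypothetical small frame with the numeric columns of the first five tips rows
sample = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "size": [2, 3, 3, 2, 4],
})

first_two = sample.head(2)   # first 2 rows instead of the default 5
last_two = sample.tail(2)    # last 2 rows
stats = sample.describe()    # count, mean, std, min, quartiles, max per column

print(sample.shape)          # (5, 3)
print(first_two)
print(last_two)
print(stats.loc["mean", "tip"])
```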