# 01 – Pandas Practice: Exploring DataFrames

## Exploring a DataFrame

Pandas provides several built-in methods and attributes for quickly understanding the structure, content, and statistics of a DataFrame.

### Previewing Data
- **`df.head()`**  
  Returns the **first 5 rows** of the DataFrame by default; pass a number, e.g. `df.head(10)`, to get a different count.  
  Useful for quickly inspecting how the data is structured.

- **`df.tail()`**  
  Returns the **last 5 rows** of the DataFrame by default; pass a number, e.g. `df.tail(10)`, to get a different count.  
  Helpful for checking how the dataset ends or verifying recent entries.
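
As a minimal sketch of previewing with a custom row count: the small DataFrame below is a stand-in built from a few of the rows shown in this notebook (an assumption made so the snippet runs without `orders.csv`), and the variable names are illustrative only.

```python
import pandas as pd

# Stand-in for orders.csv: a handful of rows copied from the notebook output
df = pd.DataFrame({
    "OrderID": [1001, 1002, 1003, 1004, 1005],
    "Product": ["Laptop", "Headphones", "Office Chair", "Desk Lamp", "Keyboard"],
    "Quantity": [1, 2, 1, 3, 2],
    "Price": [1200.0, 150.0, 300.0, 45.0, 80.0],
})

first_two = df.head(2)    # first 2 rows instead of the default 5
last_three = df.tail(3)   # last 3 rows instead of the default 5

print(first_two)
print(last_three)
```

Both calls return new DataFrames, so the results can be stored and inspected further rather than only displayed.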

---

### Dataset Overview
- **`df.info()`**  
  Prints a concise summary of the DataFrame (the method itself returns `None`), including:
  - Index range (number of rows)
  - Column names
  - Non-null value counts
  - Data types of each column
  - Overall memory usage
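
Because `df.info()` prints rather than returns its summary, one way to keep the text is to pass a buffer through the `buf` parameter. The sketch below assumes a small stand-in DataFrame (not the real `orders.csv`), and the names `buffer` and `summary_text` are illustrative.

```python
import io

import pandas as pd

# Stand-in for orders.csv: a few rows copied from the notebook output
df = pd.DataFrame({
    "OrderID": [1001, 1002, 1003],
    "Product": ["Laptop", "Headphones", "Office Chair"],
    "Price": [1200.0, 150.0, 300.0],
})

result = df.info()        # prints the summary to stdout; result is None

buffer = io.StringIO()
df.info(buf=buffer)       # write the same summary into the buffer instead
summary_text = buffer.getvalue()

print(result)
print(summary_text)
```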

---

### Descriptive Statistics
- **`df.describe()`**  
  Generates descriptive statistics for numerical columns (the default), such as:
  - count
  - mean
  - standard deviation (std)
  - minimum (min)
  - 25%, 50% (median), and 75% percentiles
  - maximum (max)
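
The statistics listed above can be checked on a tiny example. This is a sketch on a stand-in DataFrame built from rows shown in this notebook (an assumption, not the full dataset); it also shows `include="all"`, which extends the summary to non-numeric columns.

```python
import pandas as pd

# Stand-in for orders.csv: a few rows copied from the notebook output
df = pd.DataFrame({
    "Product": ["Laptop", "Headphones", "Office Chair", "Desk Lamp", "Keyboard"],
    "Quantity": [1, 2, 1, 3, 2],
    "Price": [1200.0, 150.0, 300.0, 45.0, 80.0],
})

stats = df.describe()                    # numeric columns only by default
all_stats = df.describe(include="all")   # also summarizes text columns (unique, top, freq)

print(stats)
print(stats.loc["50%", "Price"])         # the median price of these five rows
print(all_stats)
```

The result is itself a DataFrame, so individual statistics can be looked up with `.loc`, as in the median lookup here.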

---

### Columns and Index
- **`df.columns`**  
  Returns an `Index` object holding all column names in the DataFrame; use `df.columns.tolist()` to get a plain Python list.

- **`df.index`**  
  Returns the row index. For the default `RangeIndex`, the output shows the start value, stop value (exclusive), and step size.
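
A short sketch of working with these two attributes, again on a stand-in DataFrame rather than the real `orders.csv` (an assumption so the snippet is self-contained); the variable names are illustrative.

```python
import pandas as pd

# Stand-in for orders.csv: a few rows copied from the notebook output
df = pd.DataFrame({
    "OrderID": [1001, 1002, 1003, 1004, 1005],
    "Product": ["Laptop", "Headphones", "Office Chair", "Desk Lamp", "Keyboard"],
})

column_names = df.columns.tolist()   # convert the Index of names to a plain list
index = df.index                     # default index is a RangeIndex

print(column_names)
print(index.start, index.stop, index.step)
print(len(index))                    # number of rows
```

Note that `stop` is exclusive, which is why a 40-row DataFrame reports `stop=40` while its last row label is 39.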


In [None]:
import pandas as pd

# Load the orders dataset (assumes orders.csv is in the current working directory)
df = pd.read_csv('orders.csv')

In [None]:
print(df)

In [3]:
df.head()

,OrderID,CustomerName,Product,Category,Quantity,Price,OrderDate,Shipped,Country
0,1001,John Smith,Laptop,Electronics,1,1200.0,2024-06-01,Yes,USA
1,1002,Sarah Lee,Headphones,Electronics,2,150.0,2024-06-03,No,Canada
2,1003,Ali Khan,Office Chair,Furniture,1,300.0,2024-06-04,Yes,UAE
3,1004,Alice Wong,Desk Lamp,Furniture,3,45.0,2024-06-05,Yes,Singapore
4,1005,Carlos Mendez,Keyboard,Electronics,2,80.0,2024-06-06,No,Mexico


In [4]:
df.tail()

,OrderID,CustomerName,Product,Category,Quantity,Price,OrderDate,Shipped,Country
35,1036,Emma Thompson,Desk Lamp with USB,Furniture,1,68.0,2024-07-07,No,UK
36,1037,Carlos Santos,Wireless Earbuds,Electronics,1,125.0,2024-07-08,Yes,Portugal
37,1038,Leila Mansouri,Desk Pad,Furniture,1,28.0,2024-07-09,Yes,Iran
38,1039,Daniel Kim,Power Strip,Electronics,2,18.0,2024-07-10,No,South Korea
39,1040,Anna Ivanova,Desk Clock,Furniture,1,35.0,2024-07-11,Yes,Ukraine


In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 40 entries, 0 to 39
Data columns (total 9 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   OrderID       40 non-null     int64  
 1   CustomerName  40 non-null     object 
 2   Product       40 non-null     object 
 3   Category      40 non-null     object 
 4   Quantity      40 non-null     int64  
 5   Price         40 non-null     float64
 6   OrderDate     40 non-null     object 
 7   Shipped       40 non-null     object 
 8   Country       40 non-null     object 
dtypes: float64(1), int64(2), object(6)
memory usage: 2.9+ KB


In [6]:
df.describe()

,OrderID,Quantity,Price
count,40.0,40.0,40.0
mean,1020.5,5.45,106.4575
std,11.690452,15.903475,201.091854
min,1001.0,1.0,0.8
25%,1010.75,1.0,18.0
50%,1020.5,1.0,43.5
75%,1030.25,2.25,112.5
max,1040.0,100.0,1200.0


In [8]:
df.columns

Index(['OrderID', 'CustomerName', 'Product', 'Category', 'Quantity', 'Price',
       'OrderDate', 'Shipped', 'Country'],
      dtype='object')

In [None]:
df.index  # RangeIndex of the DataFrame: start=0, stop=40 (exclusive), step=1

RangeIndex(start=0, stop=40, step=1)