# Pandas

``Pandas is a powerful Python library for data manipulation, cleaning, exploration, and analysis``.
It is built on top of NumPy and designed for working with tabular data such as spreadsheets or SQL tables.

Pandas lets us analyze large datasets and draw conclusions using statistical methods. It can clean messy datasets and make them readable and relevant. Its two core data structures are the Series and the DataFrame.

``A data analyst converts raw data into clean data``. Here, raw data means data that contains missing values; data without missing values is called clean data.


# Features of pandas



1.``Easy Handling of Missing Data``: Functions like isnull(), fillna(), and dropna() simplify working with incomplete datasets

2.``Labeled Axes (Rows & Columns)``: DataFrames and Series have labeled rows and columns for intuitive data access.

3.``Flexible Data Structures``: Supports Series (1D) and DataFrame (2D), built on top of NumPy.

4.``Powerful Data Input/Output Tools``: Read/write support for CSV, Excel, JSON, SQL, Parquet, clipboard, etc.

5.``Efficient Data Selection & Filtering``: Use .loc[], .iloc[], boolean indexing, and powerful query methods.

6.``Built-in Data Aggregation & Summarization``: Functions like sum(), mean(), describe(), groupby(), etc., help analyze data quickly.

7.``Data Alignment and Integration``: Automatic alignment of data for operations on different indexes.

8.``Comprehensive Data Cleaning Tools``: Replace values, handle dtypes, drop duplicates, format conversion, etc.

9.``Robust Time Series Support``: Includes tools for date ranges, frequency conversion, resampling, and time-shifting.

10.``Data Transformation Tools``: Supports apply(), map(), replace(), applymap() (deprecated in pandas 2.1 in favor of DataFrame.map()), and custom functions.

11.``Data Merging and Joining``: Join, merge, concatenate, and reshape datasets easily using merge(), join(), concat().

12.``Group-wise Operations``: Use groupby() to perform operations like sum, mean, and custom aggregations on grouped data.

13.``Built-in Data Visualization``: Integrated with Matplotlib for quick visual analysis using .plot() and related functions.

14.``High Performance``: Built on NumPy, optimized for performance and memory usage.
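
A few of the features listed above can be sketched on a toy table. This is a minimal illustration, not a complete tour: the `sales` frame and its `region` and `units` columns are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration.
sales = pd.DataFrame(
    {
        "region": ["East", "West", "East", "West"],
        "units": [10, np.nan, 5, 8],
    }
)

# Missing data: either fill the gap with a chosen value or drop the row.
filled = sales.fillna({"units": 0})
dropped = sales.dropna()

# Label-based selection (loc) versus position-based selection (iloc).
first_units_by_label = filled.loc[0, "units"]
first_units_by_position = filled.iloc[0, 1]

# Group-wise aggregation: total units per region.
units_per_region = filled.groupby("region")["units"].sum()
print(units_per_region)
```

Whether to fill or drop missing values depends on the analysis; filling with 0 here is only a convenient choice for the sketch.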



# Pandas topics are divided into three main categories:

``1. Basic``
This section covers foundational knowledge in pandas. Ideal for beginners.

Topics:
Introduction: Overview of pandas, its purpose, and core components.

Getting Started: Installing pandas, importing it (import pandas as pd), basic usage.

Pandas Series: One-dimensional labeled array (like a column).

DataFrames: Two-dimensional table (rows and columns), the primary structure in pandas.

Read CSV: pd.read_csv() to load comma-separated data.

Read JSON: pd.read_json() for loading structured JSON files.

Analyze Data: Basic methods to explore data – df.head(), df.info(), df.describe(), etc.
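
The basic topics above can be tried end to end in a few lines. This is a small sketch under stated assumptions: the CSV text is inlined through `io.StringIO` so the example runs without a file on disk, and the three student rows are illustrative.

```python
import io

import pandas as pd

# A Series is a one-dimensional labeled array.
marks = pd.Series([78, 82, 80], index=["John", "Anna", "Ravi"], name="Marks")

# read_csv() normally takes a file path; an in-memory buffer is used here
# only so that the example is self-contained.
csv_text = "Name,Age,Marks\nJohn,15,78\nAnna,15,82\nRavi,14,80\n"
students = pd.read_csv(io.StringIO(csv_text))

print(marks["Anna"])        # label-based lookup on a Series
print(students.head(2))     # first two rows
print(students.describe())  # summary statistics for the numeric columns
students.info()             # column names, non-null counts, and dtypes
```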

``2. Cleaning Data``
Data cleaning is essential in real-world data science tasks, where raw data often contains inconsistencies.

Topics:
Clean Data: General overview of techniques like handling nulls, fixing types, etc.

Clean Empty Cells: Using dropna(), fillna() to manage missing data.

Clean Wrong Format: Changing data types using astype(), to_datetime(), etc.

Clean Wrong Data: Fix incorrect values manually or programmatically using replace() or logical indexing.

Remove Duplicates: Using drop_duplicates() to clean repeated records.
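
The cleaning steps listed above could be chained roughly as follows. The `raw` frame, its column names, and the replacement value are hypothetical; real data usually needs the steps in a different order or with different rules.

```python
import pandas as pd

# Hypothetical messy data: a repeated record, a missing price,
# numbers stored as text, and dates stored as text.
raw = pd.DataFrame(
    {
        "product": ["pen", "pen", "book", "lamp"],
        "price": ["1.5", "1.5", "12", None],
        "sold_on": ["2024-01-05", "2024-01-05", "2024-01-06", "2024-01-07"],
    }
)

clean = raw.drop_duplicates()               # remove the repeated "pen" row
clean = clean.dropna(subset=["price"])      # drop rows that have no price
clean = clean.astype({"price": "float64"})  # wrong format: text to number
clean["sold_on"] = pd.to_datetime(clean["sold_on"], format="%Y-%m-%d")
clean = clean.replace({"product": {"pen": "ballpoint pen"}})  # fix a wrong value
clean = clean.reset_index(drop=True)

print(clean)
print(clean.dtypes)
```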

``3. Advanced``
These are higher-level operations once you're comfortable with the basics.

Topics:
Correlations: Using df.corr() to examine relationships between variables.

Plotting: Visualizing data with df.plot() or libraries like Matplotlib and Seaborn.
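
As a quick illustration of `df.corr()`, the sketch below uses made-up numeric columns where the relationships are exact, so the coefficients are easy to check by eye. Plotting is left out because it needs Matplotlib installed.

```python
import pandas as pd

# Hypothetical data: "revenue" is exactly 2 * "units",
# and "returns" falls by 1 whenever "units" rises by 1.
df = pd.DataFrame(
    {"units": [1, 2, 3, 4], "revenue": [2, 4, 6, 8], "returns": [4, 3, 2, 1]}
)

corr_matrix = df.corr()  # Pearson correlation by default
print(corr_matrix)
```

A coefficient close to 1 indicates a strong positive linear relationship, close to -1 a strong negative one, and close to 0 little linear relationship.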

| Level        | Focus Area            | Goal                      |
| ------------ | --------------------- | ------------------------- |
| **Basic**    | Data structures & I/O | Understand and load data  |
| **Cleaning** | Data wrangling        | Prepare and fix the data  |
| **Advanced** | Analysis & viz        | Derive insights from data |


# List of Important Pandas Functions
These are the pandas functions you need to know for data analysis and data science. You can use them to load, clean, transform, and analyze data.

Here is a list of some of the most important pandas functions:

| Function                  | Description |
|---------------------------|-------------|
| `read_csv()`              | Retrieve data from CSV files into a DataFrame. |
| `head()`                  | Return the top n (default 5) rows. |
| `tail()`                  | Return the bottom n (default 5) rows. |
| `sample()`                | Generate a random sample from DataFrame. |
| `info()`                  | Summary of DataFrame structure. |
| `dtypes`                  | Data types of the columns (attribute, used without parentheses). |
| `shape`                   | (rows, columns) of the DataFrame (attribute). |
| `size`                    | Total number of elements (attribute). |
| `ndim`                    | Number of dimensions: 1 for a Series, 2 for a DataFrame (attribute). |
| `describe()`              | Return descriptive statistics. |
| `unique()`                | Return unique values in a column. |
| `nunique()`               | Return number of unique values. |
| `isnull()`                | Identify missing values (NaN). |
| `isna()`                  | Alias for `isnull()`. |
| `fillna()`                | Fill missing values. |
| `clip()`                  | Trim values at thresholds. |
| `columns`                | Return column labels. |
| `sort_values()`           | Sort DataFrame by column values. |
| `value_counts()`          | Count unique values in a Series. |
| `nlargest()`              | Return n largest values. |
| `nsmallest()`             | Return n smallest values. |
| `copy()`                  | Make a copy of the DataFrame. |
| `loc[]`                   | Access rows/columns by labels. |
| `iloc[]`                  | Access rows/columns by integer position. |
| `rename()`                | Rename rows/columns. |
| `where()`                 | Keep values where the condition is True; replace the rest (NaN by default). |
| `drop()`                  | Drop rows/columns. |
| `groupby()`               | Group data by one/more columns. |
| `corr()`                  | Compute pairwise correlation. |
| `query()`                 | Filter DataFrame with query string. |
| `insert()`                | Insert column at specific position. |
| `sum()`                   | Compute sum over axis. |
| `mean()`                  | Compute mean over axis. |
| `median()`                | Compute median over axis. |
| `std()`                   | Compute standard deviation. |
| `apply()`                 | Apply function to axis/columns. |
| `merge()`                 | Merge two DataFrames. |
| `astype()`                | Convert data type. |
| `set_index()`             | Set column(s) as index. |
| `reset_index()`           | Reset index to default. |
| `at[]`                    | Access single value by label. |
| `iterrows()`              | Iterate over rows as (index, Series). |
| `items()`                 | Iterate over columns as (name, Series) pairs; replaces `iteritems()`, which was removed in pandas 2.0. |
| `to_datetime()`           | Convert to datetime. |
| `to_numeric()`            | Convert to numeric type. |
| `to_string()`             | Render DataFrame to string. |
| `concat()`                | Concatenate DataFrames. |
| `cov()`                   | Compute covariance. |
| `duplicated()`            | Identify duplicate rows. |
| `drop_duplicates()`       | Remove duplicate rows. |
| `dropna()`                | Remove missing data. |
| `diff()`                  | First discrete difference. |
| `rank()`                  | Compute ranks. |
| `mask()`                  | Replace values where condition is True. |
| `resample()`              | Resample time series data. |
| `transform()`             | Apply a function and return a result with the same shape as the input. |
| `replace()`               | Replace values. |
| `to_csv()`                | Export to CSV file. |
| `to_excel()`              | Export to Excel file. |
| `to_sql()`                | Write DataFrame to SQL table. |
| `plot()`                  | Plot data (requires matplotlib). |
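
Some entries in the table are easy to mix up, so here is a short sketch on a made-up two-column frame. It shows that `shape`, `size`, and `ndim` are attributes rather than methods, and how `where()` and `mask()` differ from ordinary boolean filtering.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# shape, size, ndim, dtypes and columns are attributes: no parentheses.
print(df.shape)  # (3, 2)
print(df.size)   # 6
print(df.ndim)   # 2

# Boolean indexing drops the rows that fail the condition ...
filtered = df[df["a"] > 1]

# ... whereas where() keeps the length and puts NaN where the condition
# is False, and mask() puts NaN where the condition is True.
kept = df["a"].where(df["a"] > 1)
hidden = df["a"].mask(df["a"] > 1)

# items() iterates over (column name, Series) pairs.
column_names = [name for name, column in df.items()]
print(column_names)  # ['a', 'b']
```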



# Raw data vs Clean data







Data that contains missing values is called raw data; data without missing values is called clean data.

A data analyst converts raw data into clean data. This process is called data cleaning (or data cleansing), and the pandas library provides the tools for it.

# Raw Data
                                                                                
| Name | Age | Gender | Date of Birth | Marks |                               
| ---- | --- | ------ | ------------- | ----- |
| John | 15  |        | 2008-05-12    | 78    |
| Anna |     | Female | 12-06-2008    | 82    |
| Ravi | 14  | Male   | 2008-03-10    |       |


# Clean Data

| Name | Age | Gender | Date of Birth | Marks |
| ---- | --- | ------ | ------------- | ----- |
| John | 15  | Male   | 2008-05-12    | 78    |
| Anna | 15  | Female | 2008-06-12    | 82    |
| Ravi | 14  | Male   | 2008-03-10    | 80    |
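
One possible way to get from the raw table to the clean table is sketched below. Two things are assumptions rather than rules: the fill values for Age (15) and Gender ("Male") are simply copied from the clean table because the raw data alone cannot determine them, and `to_iso_date` is a hypothetical helper written for this example. Marks is filled with the mean of the known marks, which happens to give 80.

```python
from datetime import datetime

import numpy as np
import pandas as pd

raw = pd.DataFrame(
    {
        "Name": ["John", "Anna", "Ravi"],
        "Age": [15, np.nan, 14],
        "Gender": [np.nan, "Female", "Male"],
        "Date of Birth": ["2008-05-12", "12-06-2008", "2008-03-10"],
        "Marks": [78, 82, np.nan],
    }
)


def to_iso_date(text):
    """Return the date as YYYY-MM-DD, accepting YYYY-MM-DD or DD-MM-YYYY."""
    for fmt in ("%Y-%m-%d", "%d-%m-%Y"):
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {text!r}")


clean = raw.copy()
# Assumed fill values, copied from the clean table rather than derived.
clean["Age"] = clean["Age"].fillna(15).astype(int)
clean["Gender"] = clean["Gender"].fillna("Male")
# Fill the missing mark with the mean of the known marks: (78 + 82) / 2 = 80.
clean["Marks"] = clean["Marks"].fillna(clean["Marks"].mean()).astype(int)
# Bring both date spellings into one consistent format.
clean["Date of Birth"] = clean["Date of Birth"].map(to_iso_date)

print(clean)
```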


# About Dataset

``We use the Superstore dataset for this analysis``

| **Column Name**    | **Description**                                                                                  |
| ------------------ | ------------------------------------------------------------------------------------------------ |
| **Category**       | The general category of the product (e.g., Furniture, Office Supplies, Technology).              |
| **City**           | The name of the city where the order was placed or shipped.                                      |
| **Country/Region** | The country or region for the customer. Often this is just "United States" in standard datasets. |
| **Customer Name**  | The full name of the customer who placed the order.                                              |
| **Manufacturer**   | The company that made the product.                                                               |
| **Order Date**     | The date when the order was placed.                                                              |
| **Order ID**       | A unique identifier for each order.                                                              |
| **Postal Code**    | The ZIP or postal code of the customer or shipping location.                                     |
| **Product Name**   | The full name of the product sold.                                                               |
| **Region**         | The larger sales region (e.g., East, West, Central, South).                                      |
| **Segment**        | Type of customer (e.g., Consumer, Corporate, Home Office).                                       |
| **Ship Date**      | The date when the order was shipped.                                                             |
| **Ship Mode**      | The shipping method used (e.g., Standard Class, First Class).                                    |
| **State/Province** | The state or province of the customer.                                                           |
| **Sub-Category**   | A more detailed category under "Category" (e.g., Chairs, Phones, Binders).                       |
| **Discount**       | The discount applied to the order, stored as a fraction (e.g., 0.2 means a 20% discount).        |
| **Profit**         | The profit earned from the sale of the product.                                                  |
| **Quantity**       | Number of units of the product sold.                                                             |
| **Sales**          | Total sales value (price × quantity, minus discount).                                            |








In [45]:
import pandas as pd

In [47]:
store=pd.read_csv(r"C:\Users\91630\Desktop\FULL STACK DATASCIENCE AND GENAI\ClASSR00M\3. FEB ClASSR00M\12th FEB - Pandas\12th - Pandas\Sample - Superstore_Orders.csv")

In [49]:
store  # display the DataFrame that was loaded from the CSV file

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category,Discount,Profit,Quantity,Sales
0,Office Supplies,Houston,United States,Darren Powers,Message Book,03-01-2020,US-2020-103800,77015,"Message Book, Wirebound, Four 5 1/2"" X 4"" Form...",Central,Consumer,07-01-2020,Standard Class,Texas,Paper,0.2,5.5512,2,16.448
1,Office Supplies,Naperville,United States,Phillina Ober,GBC,04-01-2020,US-2020-112326,60540,GBC Standard Plastic Binding Systems Combs,Central,Home Office,08-01-2020,Standard Class,Illinois,Binders,0.8,-5.4870,2,3.540
2,Office Supplies,Naperville,United States,Phillina Ober,Avery,04-01-2020,US-2020-112326,60540,Avery 508,Central,Home Office,08-01-2020,Standard Class,Illinois,Labels,0.2,4.2717,3,11.784
3,Office Supplies,Naperville,United States,Phillina Ober,SAFCO,04-01-2020,US-2020-112326,60540,SAFCO Boltless Steel Shelving,Central,Home Office,08-01-2020,Standard Class,Illinois,Storage,0.2,-64.7748,3,272.736
4,Office Supplies,Philadelphia,United States,Mick Brown,Avery,05-01-2020,US-2020-141817,19143,Avery Hi-Liter EverBold Pen Style Fluorescent ...,East,Consumer,12-01-2020,Standard Class,Pennsylvania,Art,0.2,4.8840,3,19.536
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10189,Office Supplies,New York City,United States,Patrick O'Donnell,Wilson Jones,30-12-2023,US-2023-143259,10009,Wilson Jones Legal Size Ring Binders,East,Consumer,03-01-2024,Standard Class,New York,Binders,0.2,19.7910,3,52.776
10190,Office Supplies,Fairfield,United States,Erica Bern,GBC,30-12-2023,US-2023-115427,94533,GBC Binding covers,West,Corporate,03-01-2024,Standard Class,California,Binders,0.2,6.4750,2,20.720
10191,Office Supplies,Loveland,United States,Jill Matthias,Other,30-12-2023,US-2023-156720,80538,Bagged Rubber Bands,West,Consumer,03-01-2024,Standard Class,Colorado,Fasteners,0.2,-0.6048,3,3.024
10192,Technology,New York City,United States,Patrick O'Donnell,Other,30-12-2023,US-2023-143259,10009,Gear Head AU3700S Headset,East,Consumer,03-01-2024,Standard Class,New York,Phones,0.0,2.7279,7,90.930


In [51]:
id(store)

1628971152064

In [53]:
len(store)  # number of rows

10194

In [57]:
store.shape  # dimensions of the DataFrame as (rows, columns)

(10194, 19)

In [59]:
store.columns  # labels of all columns in the dataset

Index(['Category', 'City', 'Country/Region', 'Customer Name', 'Manufacturer',
       'Order Date', 'Order ID', 'Postal Code', 'Product Name', 'Region',
       'Segment', 'Ship Date', 'Ship Mode', 'State/Province', 'Sub-Category',
       'Discount', 'Profit', 'Quantity', 'Sales'],
      dtype='object')

In [61]:
len(store.columns)

19

In [63]:
store.dtypes  # data type of each column: 15 are object (text/categorical) and 4 are numeric

Category           object
City               object
Country/Region     object
Customer Name      object
Manufacturer       object
Order Date         object
Order ID           object
Postal Code        object
Product Name       object
Region             object
Segment            object
Ship Date          object
Ship Mode          object
State/Province     object
Sub-Category       object
Discount          float64
Profit            float64
Quantity            int64
Sales             float64
dtype: object
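
The output above shows `Order Date` and `Ship Date` stored as `object`, meaning plain text. A minimal sketch of converting them to real dates follows; it uses a few date values copied from the rows displayed earlier, and the `Days to Ship` column is a hypothetical addition for illustration.

```python
import pandas as pd

# A few values copied from the rows displayed earlier in this document.
orders = pd.DataFrame(
    {
        "Order Date": ["03-01-2020", "04-01-2020", "30-12-2023"],
        "Ship Date": ["07-01-2020", "08-01-2020", "03-01-2024"],
    }
)

# The strings are day-first (30-12-2023 can only mean 30 December),
# so an explicit format avoids any day/month ambiguity.
for column in ["Order Date", "Ship Date"]:
    orders[column] = pd.to_datetime(orders[column], format="%d-%m-%Y")

# With real datetimes, date arithmetic becomes possible.
orders["Days to Ship"] = (orders["Ship Date"] - orders["Order Date"]).dt.days

print(orders.dtypes)
print(orders)
```

`Postal Code` is also `object`, most likely because values such as `C0A` are not numeric; leaving it as text is usually the right choice for codes.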

In [65]:
store.info()  # summary of the dataset: row count, column names, non-null counts, and dtypes

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10194 entries, 0 to 10193
Data columns (total 19 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Category        10194 non-null  object 
 1   City            10194 non-null  object 
 2   Country/Region  10194 non-null  object 
 3   Customer Name   10194 non-null  object 
 4   Manufacturer    10194 non-null  object 
 5   Order Date      10194 non-null  object 
 6   Order ID        10194 non-null  object 
 7   Postal Code     10194 non-null  object 
 8   Product Name    10194 non-null  object 
 9   Region          10194 non-null  object 
 10  Segment         10194 non-null  object 
 11  Ship Date       10194 non-null  object 
 12  Ship Mode       10194 non-null  object 
 13  State/Province  10194 non-null  object 
 14  Sub-Category    10194 non-null  object 
 15  Discount        10194 non-null  float64
 16  Profit          10194 non-null  float64
 17  Quantity        10194 non-null 

In [67]:
store.isnull()  # True where a value is missing, False otherwise

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category,Discount,Profit,Quantity,Sales
0,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
1,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10189,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
10190,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
10191,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
10192,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False


In [69]:
store.isnull().sum()  # count of null values in every column

Category          0
City              0
Country/Region    0
Customer Name     0
Manufacturer      0
Order Date        0
Order ID          0
Postal Code       0
Product Name      0
Region            0
Segment           0
Ship Date         0
Ship Mode         0
State/Province    0
Sub-Category      0
Discount          0
Profit            0
Quantity          0
Sales             0
dtype: int64
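
Every count above is zero because this file has no missing values. To see what the same checks report when gaps exist, here is a small sketch on a made-up frame; the `city` and `sales` values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with gaps, unlike the Superstore data above.
df = pd.DataFrame(
    {
        "city": ["Houston", None, "Athens", "Fairfield"],
        "sales": [16.4, 3.5, np.nan, np.nan],
    }
)

null_counts = df.isnull().sum()               # missing values per column
null_share = df.isnull().mean()               # fraction missing per column
rows_with_gaps = df[df.isnull().any(axis=1)]  # rows with at least one gap

print(null_counts)
print(null_share)
print(rows_with_gaps)
```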

In [71]:
store.head()  # top rows; returns the first 5 rows by default

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category,Discount,Profit,Quantity,Sales
0,Office Supplies,Houston,United States,Darren Powers,Message Book,03-01-2020,US-2020-103800,77015,"Message Book, Wirebound, Four 5 1/2"" X 4"" Form...",Central,Consumer,07-01-2020,Standard Class,Texas,Paper,0.2,5.5512,2,16.448
1,Office Supplies,Naperville,United States,Phillina Ober,GBC,04-01-2020,US-2020-112326,60540,GBC Standard Plastic Binding Systems Combs,Central,Home Office,08-01-2020,Standard Class,Illinois,Binders,0.8,-5.487,2,3.54
2,Office Supplies,Naperville,United States,Phillina Ober,Avery,04-01-2020,US-2020-112326,60540,Avery 508,Central,Home Office,08-01-2020,Standard Class,Illinois,Labels,0.2,4.2717,3,11.784
3,Office Supplies,Naperville,United States,Phillina Ober,SAFCO,04-01-2020,US-2020-112326,60540,SAFCO Boltless Steel Shelving,Central,Home Office,08-01-2020,Standard Class,Illinois,Storage,0.2,-64.7748,3,272.736
4,Office Supplies,Philadelphia,United States,Mick Brown,Avery,05-01-2020,US-2020-141817,19143,Avery Hi-Liter EverBold Pen Style Fluorescent ...,East,Consumer,12-01-2020,Standard Class,Pennsylvania,Art,0.2,4.884,3,19.536


In [73]:
store.tail()  # bottom rows; returns the last 5 rows by default

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category,Discount,Profit,Quantity,Sales
10189,Office Supplies,New York City,United States,Patrick O'Donnell,Wilson Jones,30-12-2023,US-2023-143259,10009,Wilson Jones Legal Size Ring Binders,East,Consumer,03-01-2024,Standard Class,New York,Binders,0.2,19.791,3,52.776
10190,Office Supplies,Fairfield,United States,Erica Bern,GBC,30-12-2023,US-2023-115427,94533,GBC Binding covers,West,Corporate,03-01-2024,Standard Class,California,Binders,0.2,6.475,2,20.72
10191,Office Supplies,Loveland,United States,Jill Matthias,Other,30-12-2023,US-2023-156720,80538,Bagged Rubber Bands,West,Consumer,03-01-2024,Standard Class,Colorado,Fasteners,0.2,-0.6048,3,3.024
10192,Technology,New York City,United States,Patrick O'Donnell,Other,30-12-2023,US-2023-143259,10009,Gear Head AU3700S Headset,East,Consumer,03-01-2024,Standard Class,New York,Phones,0.0,2.7279,7,90.93
10193,Office Supplies,Charlottetown,Canada,Harry Olson,Wilson Jones,30-12-2023,CA-2023-143500,C0A,Wilson Jones Impact Binders,East,Consumer,03-01-2024,Standard Class,Prince Edward Island,Binders,0.2,-0.6048,3,3.024


In [75]:
store.head(3)  # first 3 rows

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category,Discount,Profit,Quantity,Sales
0,Office Supplies,Houston,United States,Darren Powers,Message Book,03-01-2020,US-2020-103800,77015,"Message Book, Wirebound, Four 5 1/2"" X 4"" Form...",Central,Consumer,07-01-2020,Standard Class,Texas,Paper,0.2,5.5512,2,16.448
1,Office Supplies,Naperville,United States,Phillina Ober,GBC,04-01-2020,US-2020-112326,60540,GBC Standard Plastic Binding Systems Combs,Central,Home Office,08-01-2020,Standard Class,Illinois,Binders,0.8,-5.487,2,3.54
2,Office Supplies,Naperville,United States,Phillina Ober,Avery,04-01-2020,US-2020-112326,60540,Avery 508,Central,Home Office,08-01-2020,Standard Class,Illinois,Labels,0.2,4.2717,3,11.784


In [77]:
store.tail(3)  # last 3 rows

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category,Discount,Profit,Quantity,Sales
10191,Office Supplies,Loveland,United States,Jill Matthias,Other,30-12-2023,US-2023-156720,80538,Bagged Rubber Bands,West,Consumer,03-01-2024,Standard Class,Colorado,Fasteners,0.2,-0.6048,3,3.024
10192,Technology,New York City,United States,Patrick O'Donnell,Other,30-12-2023,US-2023-143259,10009,Gear Head AU3700S Headset,East,Consumer,03-01-2024,Standard Class,New York,Phones,0.0,2.7279,7,90.93
10193,Office Supplies,Charlottetown,Canada,Harry Olson,Wilson Jones,30-12-2023,CA-2023-143500,C0A,Wilson Jones Impact Binders,East,Consumer,03-01-2024,Standard Class,Prince Edward Island,Binders,0.2,-0.6048,3,3.024


In [79]:
store.describe()  # descriptive statistics (count, mean, std, min, quartiles, max) for the numeric columns

Unnamed: 0,Discount,Profit,Quantity,Sales
count,10194.0,10194.0,10194.0,10194.0
mean,0.155385,28.673417,3.791838,228.225854
std,0.206249,232.465115,2.228317,619.906839
min,0.0,-6599.978,1.0,0.444
25%,0.0,1.7608,2.0,17.22
50%,0.2,8.69,3.0,53.91
75%,0.2,29.297925,5.0,209.5
max,0.8,8399.976,14.0,22638.48
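
By default `describe()` covers only numeric columns, which is why the text columns are absent above. Calling it on a single text column gives a different summary, as the sketch below shows on a made-up sample (the four rows are illustrative, not taken from the full file).

```python
import pandas as pd

# Hypothetical sample; the values are illustrative only.
df = pd.DataFrame(
    {
        "Category": ["Office Supplies", "Office Supplies", "Furniture", "Technology"],
        "Quantity": [2, 3, 9, 4],
    }
)

numeric_summary = df.describe()               # numeric columns only by default
category_summary = df["Category"].describe()  # count, unique, top, freq for text

print(numeric_summary.loc["mean", "Quantity"])  # 4.5
print(category_summary["top"])                  # Office Supplies
```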


# Slicing in Pandas

In [82]:
store[:] 

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category,Discount,Profit,Quantity,Sales
0,Office Supplies,Houston,United States,Darren Powers,Message Book,03-01-2020,US-2020-103800,77015,"Message Book, Wirebound, Four 5 1/2"" X 4"" Form...",Central,Consumer,07-01-2020,Standard Class,Texas,Paper,0.2,5.5512,2,16.448
1,Office Supplies,Naperville,United States,Phillina Ober,GBC,04-01-2020,US-2020-112326,60540,GBC Standard Plastic Binding Systems Combs,Central,Home Office,08-01-2020,Standard Class,Illinois,Binders,0.8,-5.4870,2,3.540
2,Office Supplies,Naperville,United States,Phillina Ober,Avery,04-01-2020,US-2020-112326,60540,Avery 508,Central,Home Office,08-01-2020,Standard Class,Illinois,Labels,0.2,4.2717,3,11.784
3,Office Supplies,Naperville,United States,Phillina Ober,SAFCO,04-01-2020,US-2020-112326,60540,SAFCO Boltless Steel Shelving,Central,Home Office,08-01-2020,Standard Class,Illinois,Storage,0.2,-64.7748,3,272.736
4,Office Supplies,Philadelphia,United States,Mick Brown,Avery,05-01-2020,US-2020-141817,19143,Avery Hi-Liter EverBold Pen Style Fluorescent ...,East,Consumer,12-01-2020,Standard Class,Pennsylvania,Art,0.2,4.8840,3,19.536
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10189,Office Supplies,New York City,United States,Patrick O'Donnell,Wilson Jones,30-12-2023,US-2023-143259,10009,Wilson Jones Legal Size Ring Binders,East,Consumer,03-01-2024,Standard Class,New York,Binders,0.2,19.7910,3,52.776
10190,Office Supplies,Fairfield,United States,Erica Bern,GBC,30-12-2023,US-2023-115427,94533,GBC Binding covers,West,Corporate,03-01-2024,Standard Class,California,Binders,0.2,6.4750,2,20.720
10191,Office Supplies,Loveland,United States,Jill Matthias,Other,30-12-2023,US-2023-156720,80538,Bagged Rubber Bands,West,Consumer,03-01-2024,Standard Class,Colorado,Fasteners,0.2,-0.6048,3,3.024
10192,Technology,New York City,United States,Patrick O'Donnell,Other,30-12-2023,US-2023-143259,10009,Gear Head AU3700S Headset,East,Consumer,03-01-2024,Standard Class,New York,Phones,0.0,2.7279,7,90.930


In [84]:
store[2:10]  # rows at positions 2 through 9 (the stop position, 10, is excluded)

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category,Discount,Profit,Quantity,Sales
2,Office Supplies,Naperville,United States,Phillina Ober,Avery,04-01-2020,US-2020-112326,60540,Avery 508,Central,Home Office,08-01-2020,Standard Class,Illinois,Labels,0.2,4.2717,3,11.784
3,Office Supplies,Naperville,United States,Phillina Ober,SAFCO,04-01-2020,US-2020-112326,60540,SAFCO Boltless Steel Shelving,Central,Home Office,08-01-2020,Standard Class,Illinois,Storage,0.2,-64.7748,3,272.736
4,Office Supplies,Philadelphia,United States,Mick Brown,Avery,05-01-2020,US-2020-141817,19143,Avery Hi-Liter EverBold Pen Style Fluorescent ...,East,Consumer,12-01-2020,Standard Class,Pennsylvania,Art,0.2,4.884,3,19.536
5,Furniture,Henderson,United States,Maria Etezadi,Global,06-01-2020,US-2020-167199,42420,Global Deluxe High-Back Manager's Chair,South,Home Office,10-01-2020,Standard Class,Kentucky,Chairs,0.0,746.4078,9,2573.82
6,Office Supplies,Henderson,United States,Maria Etezadi,Rogers,06-01-2020,US-2020-167199,42420,Rogers Handheld Barrel Pencil Sharpener,South,Home Office,10-01-2020,Standard Class,Kentucky,Art,0.0,1.4796,2,5.48
7,Office Supplies,Athens,United States,Jack O'Briant,Dixon,06-01-2020,US-2020-106054,30605,"Dixon Prang Watercolor Pencils, 10-Color Set w...",South,Corporate,07-01-2020,First Class,Georgia,Art,0.0,5.2398,3,12.78
8,Office Supplies,Henderson,United States,Maria Etezadi,Ibico,06-01-2020,US-2020-167199,42420,Ibico Hi-Tech Manual Binding System,South,Home Office,10-01-2020,Standard Class,Kentucky,Binders,0.0,274.491,2,609.98
9,Office Supplies,Henderson,United States,Maria Etezadi,Alliance,06-01-2020,US-2020-167199,42420,"Alliance Super-Size Bands, Assorted Sizes",South,Home Office,10-01-2020,Standard Class,Kentucky,Fasteners,0.0,0.3112,4,31.12


In [86]:
store[10:]  # from the row at position 10 to the last row

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category,Discount,Profit,Quantity,Sales
10,Office Supplies,Henderson,United States,Maria Etezadi,Southworth,06-01-2020,US-2020-167199,42420,Southworth 25% Cotton Granite Paper & Envelopes,South,Home Office,10-01-2020,Standard Class,Kentucky,Paper,0.0,3.0084,1,6.540
11,Office Supplies,Los Angeles,United States,Lycoris Saunders,Xerox,06-01-2020,US-2020-130813,90049,Xerox 225,West,Consumer,08-01-2020,Second Class,California,Paper,0.0,9.3312,3,19.440
12,Technology,Henderson,United States,Maria Etezadi,Other,06-01-2020,US-2020-167199,42420,Wireless Extenders zBoost YX545 SOHO Signal Bo...,South,Home Office,10-01-2020,Standard Class,Kentucky,Phones,0.0,204.1092,4,755.960
13,Technology,Henderson,United States,Maria Etezadi,GE,06-01-2020,US-2020-167199,42420,GE 30524EE4,South,Home Office,10-01-2020,Standard Class,Kentucky,Phones,0.0,113.6742,2,391.980
14,Furniture,Huntsville,United States,Vivek Sundaresam,Howard Miller,07-01-2020,US-2020-105417,77340,"Howard Miller 14-1/2"" Diameter Chrome Round Wa...",Central,Consumer,12-01-2020,Standard Class,Texas,Furnishings,0.6,-53.7096,3,76.728
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10189,Office Supplies,New York City,United States,Patrick O'Donnell,Wilson Jones,30-12-2023,US-2023-143259,10009,Wilson Jones Legal Size Ring Binders,East,Consumer,03-01-2024,Standard Class,New York,Binders,0.2,19.7910,3,52.776
10190,Office Supplies,Fairfield,United States,Erica Bern,GBC,30-12-2023,US-2023-115427,94533,GBC Binding covers,West,Corporate,03-01-2024,Standard Class,California,Binders,0.2,6.4750,2,20.720
10191,Office Supplies,Loveland,United States,Jill Matthias,Other,30-12-2023,US-2023-156720,80538,Bagged Rubber Bands,West,Consumer,03-01-2024,Standard Class,Colorado,Fasteners,0.2,-0.6048,3,3.024
10192,Technology,New York City,United States,Patrick O'Donnell,Other,30-12-2023,US-2023-143259,10009,Gear Head AU3700S Headset,East,Consumer,03-01-2024,Standard Class,New York,Phones,0.0,2.7279,7,90.930


In [88]:
store[:20]  # rows at positions 0 through 19

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category,Discount,Profit,Quantity,Sales
0,Office Supplies,Houston,United States,Darren Powers,Message Book,03-01-2020,US-2020-103800,77015,"Message Book, Wirebound, Four 5 1/2"" X 4"" Form...",Central,Consumer,07-01-2020,Standard Class,Texas,Paper,0.2,5.5512,2,16.448
1,Office Supplies,Naperville,United States,Phillina Ober,GBC,04-01-2020,US-2020-112326,60540,GBC Standard Plastic Binding Systems Combs,Central,Home Office,08-01-2020,Standard Class,Illinois,Binders,0.8,-5.487,2,3.54
2,Office Supplies,Naperville,United States,Phillina Ober,Avery,04-01-2020,US-2020-112326,60540,Avery 508,Central,Home Office,08-01-2020,Standard Class,Illinois,Labels,0.2,4.2717,3,11.784
3,Office Supplies,Naperville,United States,Phillina Ober,SAFCO,04-01-2020,US-2020-112326,60540,SAFCO Boltless Steel Shelving,Central,Home Office,08-01-2020,Standard Class,Illinois,Storage,0.2,-64.7748,3,272.736
4,Office Supplies,Philadelphia,United States,Mick Brown,Avery,05-01-2020,US-2020-141817,19143,Avery Hi-Liter EverBold Pen Style Fluorescent ...,East,Consumer,12-01-2020,Standard Class,Pennsylvania,Art,0.2,4.884,3,19.536
5,Furniture,Henderson,United States,Maria Etezadi,Global,06-01-2020,US-2020-167199,42420,Global Deluxe High-Back Manager's Chair,South,Home Office,10-01-2020,Standard Class,Kentucky,Chairs,0.0,746.4078,9,2573.82
6,Office Supplies,Henderson,United States,Maria Etezadi,Rogers,06-01-2020,US-2020-167199,42420,Rogers Handheld Barrel Pencil Sharpener,South,Home Office,10-01-2020,Standard Class,Kentucky,Art,0.0,1.4796,2,5.48
7,Office Supplies,Athens,United States,Jack O'Briant,Dixon,06-01-2020,US-2020-106054,30605,"Dixon Prang Watercolor Pencils, 10-Color Set w...",South,Corporate,07-01-2020,First Class,Georgia,Art,0.0,5.2398,3,12.78
8,Office Supplies,Henderson,United States,Maria Etezadi,Ibico,06-01-2020,US-2020-167199,42420,Ibico Hi-Tech Manual Binding System,South,Home Office,10-01-2020,Standard Class,Kentucky,Binders,0.0,274.491,2,609.98
9,Office Supplies,Henderson,United States,Maria Etezadi,Alliance,06-01-2020,US-2020-167199,42420,"Alliance Super-Size Bands, Assorted Sizes",South,Home Office,10-01-2020,Standard Class,Kentucky,Fasteners,0.0,0.3112,4,31.12


In [90]:
store[::-1]  # reverse the row order (last row first); each row keeps its original index label

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category,Discount,Profit,Quantity,Sales
10193,Office Supplies,Charlottetown,Canada,Harry Olson,Wilson Jones,30-12-2023,CA-2023-143500,C0A,Wilson Jones Impact Binders,East,Consumer,03-01-2024,Standard Class,Prince Edward Island,Binders,0.2,-0.6048,3,3.024
10192,Technology,New York City,United States,Patrick O'Donnell,Other,30-12-2023,US-2023-143259,10009,Gear Head AU3700S Headset,East,Consumer,03-01-2024,Standard Class,New York,Phones,0.0,2.7279,7,90.930
10191,Office Supplies,Loveland,United States,Jill Matthias,Other,30-12-2023,US-2023-156720,80538,Bagged Rubber Bands,West,Consumer,03-01-2024,Standard Class,Colorado,Fasteners,0.2,-0.6048,3,3.024
10190,Office Supplies,Fairfield,United States,Erica Bern,GBC,30-12-2023,US-2023-115427,94533,GBC Binding covers,West,Corporate,03-01-2024,Standard Class,California,Binders,0.2,6.4750,2,20.720
10189,Office Supplies,New York City,United States,Patrick O'Donnell,Wilson Jones,30-12-2023,US-2023-143259,10009,Wilson Jones Legal Size Ring Binders,East,Consumer,03-01-2024,Standard Class,New York,Binders,0.2,19.7910,3,52.776
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4,Office Supplies,Philadelphia,United States,Mick Brown,Avery,05-01-2020,US-2020-141817,19143,Avery Hi-Liter EverBold Pen Style Fluorescent ...,East,Consumer,12-01-2020,Standard Class,Pennsylvania,Art,0.2,4.8840,3,19.536
3,Office Supplies,Naperville,United States,Phillina Ober,SAFCO,04-01-2020,US-2020-112326,60540,SAFCO Boltless Steel Shelving,Central,Home Office,08-01-2020,Standard Class,Illinois,Storage,0.2,-64.7748,3,272.736
2,Office Supplies,Naperville,United States,Phillina Ober,Avery,04-01-2020,US-2020-112326,60540,Avery 508,Central,Home Office,08-01-2020,Standard Class,Illinois,Labels,0.2,4.2717,3,11.784
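1,Office Supplies,Naperville,United States,Phillina Ober,GBC,04-01-2020,US-2020-112326,60540,GBC Standard Plastic Binding Systems Combs,Central,Home Office,08-01-2020,Standard Class,Illinois,Binders,0.8,-5.4870,2,3.540
0,Office Supplies,Houston,United States,Darren Powers,Message Book,03-01-2020,US-2020-103800,77015,"Message Book, Wirebound, Four 5 1/2"" X 4"" Form...",Central,Consumer,07-01-2020,Standard Class,Texas,Paper,0.2,5.5512,2,16.448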


In [92]:
store[::10]  # select every 10th row, starting from the first row

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category,Discount,Profit,Quantity,Sales
0,Office Supplies,Houston,United States,Darren Powers,Message Book,03-01-2020,US-2020-103800,77015,"Message Book, Wirebound, Four 5 1/2"" X 4"" Form...",Central,Consumer,07-01-2020,Standard Class,Texas,Paper,0.2,5.5512,2,16.448
10,Office Supplies,Henderson,United States,Maria Etezadi,Southworth,06-01-2020,US-2020-167199,42420,Southworth 25% Cotton Granite Paper & Envelopes,South,Home Office,10-01-2020,Standard Class,Kentucky,Paper,0.0,3.0084,1,6.540
20,Furniture,Dover,United States,Seth Vernon,DAX,11-01-2020,US-2020-130092,19901,"DAX Value U-Channel Document Frames, Easel Back",East,Consumer,14-01-2020,First Class,Delaware,Furnishings,0.0,3.0814,2,9.940
30,Office Supplies,San Francisco,United States,Brian Dahlen,Tennsco,13-01-2020,US-2020-157147,94109,Tennsco 6- and 18-Compartment Lockers,West,Consumer,18-01-2020,Standard Class,California,Storage,0.0,238.6530,5,1325.850
40,Office Supplies,Scottsdale,United States,Toby Swindell,GBC,19-01-2020,US-2020-146591,85254,"GBC Standard Recycled Report Covers, Clear Pla...",West,Consumer,20-01-2020,First Class,Arizona,Binders,0.7,-23.7160,10,32.340
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10150,Office Supplies,Newark,United States,Dan Reichenbach,BIC,27-12-2023,US-2023-134404,43055,BIC Brite Liner Highlighters,East,Corporate,27-12-2023,Same Day,Ohio,Art,0.2,3.6432,4,13.248
10160,Office Supplies,New York City,United States,Jennifer Ferguson,Storex,28-12-2023,US-2023-164826,10024,Storex Dura Pro Binders,East,Consumer,04-01-2024,Standard Class,New York,Binders,0.2,11.2266,7,33.264
10170,Technology,New York City,United States,Jennifer Ferguson,Other,28-12-2023,US-2023-164826,10024,Cush Cases Heavy Duty Rugged Cover Case for Sa...,East,Consumer,04-01-2024,Standard Class,New York,Phones,0.0,4.0095,3,14.850
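10180,Office Supplies,New York City,United States,Michael Chen,Other,29-12-2023,US-2023-102638,10035,Ideal Clamps,East,Consumer,31-12-2023,First Class,New York,Fasteners,0.0,2.9547,3,6.030
10190,Office Supplies,Fairfield,United States,Erica Bern,GBC,30-12-2023,US-2023-115427,94533,GBC Binding covers,West,Corporate,03-01-2024,Standard Class,California,Binders,0.2,6.4750,2,20.720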


In [102]:
store[-1::-10]  # start at the last row and step backwards, taking every 10th row

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category,Discount,Profit,Quantity,Sales
10193,Office Supplies,Charlottetown,Canada,Harry Olson,Wilson Jones,30-12-2023,CA-2023-143500,C0A,Wilson Jones Impact Binders,East,Consumer,03-01-2024,Standard Class,Prince Edward Island,Binders,0.2,-0.6048,3,3.024
10183,Furniture,St. John's,Canada,James Peterman,Nu-Dell,29-12-2023,CA-2023-146626,A0A,Nu-Dell Executive Frame,East,Corporate,05-01-2024,Standard Class,Newfoundland and Labrador,Furnishings,0.0,35.4144,8,99.120
10173,Furniture,Los Angeles,United States,James Galang,Global,29-12-2023,US-2023-118885,90049,"Global High-Back Leather Tilter, Burgundy",West,Consumer,02-01-2024,Standard Class,California,Chairs,0.2,-44.2764,4,393.568
10163,Office Supplies,Fargo,United States,Christopher Schild,Wilson Jones,28-12-2023,US-2023-135111,58103,Wilson Jones Impact Binders,Central,Home Office,02-01-2024,Standard Class,North Dakota,Binders,0.0,12.6910,5,25.900
10153,Furniture,Long Beach,United States,Jason Gross,Novimex,28-12-2023,US-2023-101322,90805,Novimex Turbo Task Chair,West,Corporate,31-12-2023,First Class,California,Chairs,0.2,-34.0704,6,340.704
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
43,Furniture,Jonesboro,United States,Hunter Lopez,Hon,20-01-2020,US-2020-147627,72401,Hon 4700 Series Mobuis Mid-Back Task Chairs wi...,South,Consumer,26-01-2020,Standard Class,Arkansas,Chairs,0.0,224.2674,3,1067.940
33,Technology,Roswell,United States,Erica Hackney,Logitech,15-01-2020,US-2020-103366,30076,Logitech 910-002974 M325 Wireless Mouse for We...,South,Consumer,17-01-2020,First Class,Georgia,Accessories,0.0,65.9780,5,149.950
23,Office Supplies,San Francisco,United States,Brian Dahlen,Other,13-01-2020,US-2020-157147,94109,4009 Highlighters by Sanford,West,Consumer,18-01-2020,Standard Class,California,Art,0.0,6.5670,5,19.900
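13,Technology,Henderson,United States,Maria Etezadi,GE,06-01-2020,US-2020-167199,42420,GE 30524EE4,South,Home Office,10-01-2020,Standard Class,Kentucky,Phones,0.0,113.6742,2,391.980
3,Office Supplies,Naperville,United States,Phillina Ober,SAFCO,04-01-2020,US-2020-112326,60540,SAFCO Boltless Steel Shelving,Central,Home Office,08-01-2020,Standard Class,Illinois,Storage,0.2,-64.7748,3,272.736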
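
The three slices above follow Python's `start:stop:step` convention, applied to row positions. As a minimal sketch, the small made-up frame below (`demo` and its values are illustrative and not part of the store dataset) shows which rows each form picks, and that the original index labels travel with the rows:

```python
import pandas as pd

# Hypothetical 7-row frame standing in for `store`
demo = pd.DataFrame({"Sales": [10, 20, 30, 40, 50, 60, 70]})

reversed_rows = demo[::-1]      # all rows, last to first
every_third = demo[::3]         # positions 0, 3, 6
backwards_third = demo[-1::-3]  # positions 6, 3, 0

print(reversed_rows.index.tolist())    # [6, 5, 4, 3, 2, 1, 0]
print(every_third.index.tolist())      # [0, 3, 6]
print(backwards_third.index.tolist())  # [6, 3, 0]

# reset_index(drop=True) renumbers the rows from 0 when the old labels are not needed
renumbered = reversed_rows.reset_index(drop=True)
print(renumbered["Sales"].tolist())    # [70, 60, 50, 40, 30, 20, 10]
```

With a negative step the slice already starts at the last row by default, so `demo[-1::-3]` and `demo[::-3]` select the same rows.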


In [128]:
store['Category']  # select a single column by name; the result is a Series

0        Office Supplies
1        Office Supplies
2        Office Supplies
3        Office Supplies
4        Office Supplies
              ...       
10189    Office Supplies
10190    Office Supplies
10191    Office Supplies
10192         Technology
10193    Office Supplies
Name: Category, Length: 10194, dtype: object
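
The output above is a Series rather than a DataFrame, which is why it ends with a `Name:` and `Length:` footer. A small sketch, using a made-up four-row frame (`demo` is hypothetical), of what single-bracket selection returns and one common thing to do with the result:

```python
import pandas as pd

# Hypothetical miniature of the store data
demo = pd.DataFrame({
    "Category": ["Office Supplies", "Furniture", "Office Supplies", "Technology"],
    "Sales": [16.448, 2573.82, 11.784, 90.93],
})

category = demo["Category"]
print(type(category).__name__)  # Series
print(category.name)            # Category
print(len(category))            # 4

# value_counts() tallies how often each label occurs
counts = category.value_counts()
print(counts["Office Supplies"])  # 2
```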

In [114]:
store_data = store[['Category', 'City', 'Country/Region']]  # pass a list of names to select several columns; the result is a DataFrame
store_data

Unnamed: 0,Category,City,Country/Region
0,Office Supplies,Houston,United States
1,Office Supplies,Naperville,United States
2,Office Supplies,Naperville,United States
3,Office Supplies,Naperville,United States
4,Office Supplies,Philadelphia,United States
...,...,...,...
10189,Office Supplies,New York City,United States
10190,Office Supplies,Fairfield,United States
10191,Office Supplies,Loveland,United States
10192,Technology,New York City,United States
10193,Office Supplies,Charlottetown,Canada
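
The inner pair of brackets is an ordinary Python list of column names. The sketch below, on a made-up two-row frame (`demo` and the misspelled name `Town` are illustrative), contrasts single and double brackets and shows what happens when a requested name does not exist:

```python
import pandas as pd

# Hypothetical miniature of the store data
demo = pd.DataFrame({
    "Category": ["Office Supplies", "Furniture"],
    "City": ["Houston", "Henderson"],
    "Sales": [16.448, 2573.82],
})

one_col_series = demo["City"]          # single brackets -> Series
one_col_frame = demo[["City"]]         # list with one name -> one-column DataFrame
two_cols = demo[["Category", "City"]]  # list with two names -> two-column DataFrame

print(type(one_col_series).__name__)  # Series
print(type(one_col_frame).__name__)   # DataFrame
print(two_cols.shape)                 # (2, 2)

# Asking for a name that is not a column raises KeyError
missing_raised = False
try:
    demo[["Category", "Town"]]
except KeyError as err:
    missing_raised = True
    print("missing column:", err)
```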


In [147]:
store.columns  # returns an Index object holding all column names

Index(['Category', 'City', 'Country/Region', 'Customer Name', 'Manufacturer',
       'Order Date', 'Order ID', 'Postal Code', 'Product Name', 'Region',
       'Segment', 'Ship Date', 'Ship Mode', 'State/Province', 'Sub-Category',
       'Discount', 'Profit', 'Quantity', 'Sales'],
      dtype='object')
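
`columns` returns an `Index`, not a plain list. A brief sketch on a made-up one-row frame (`demo` is hypothetical) of a few things that can be done with it:

```python
import pandas as pd

# Hypothetical frame with three of the store columns
demo = pd.DataFrame({
    "Category": ["Furniture"],
    "City": ["Henderson"],
    "Sales": [2573.82],
})

cols = demo.columns
print(cols.tolist())         # ['Category', 'City', 'Sales']
print("Sales" in cols)       # True
print("Profit" in cols)      # False
print(cols.get_loc("City"))  # 1 (zero-based position of the column)
```

Checking membership before selecting is one way to avoid a `KeyError` caused by a misspelled column name.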

In [163]:
store_categorical_data = store[['Category', 'City', 'Country/Region', 'Customer Name', 'Manufacturer', 'Order Date', 'Order ID', 'Postal Code', 'Product Name', 'Region', 'Segment', 'Ship Date', 'Ship Mode', 'State/Province', 'Sub-Category']]
store_categorical_data   # the non-numeric columns: labels, names, IDs, and dates stored as text

Unnamed: 0,Category,City,Country/Region,Customer Name,Manufacturer,Order Date,Order ID,Postal Code,Product Name,Region,Segment,Ship Date,Ship Mode,State/Province,Sub-Category
0,Office Supplies,Houston,United States,Darren Powers,Message Book,03-01-2020,US-2020-103800,77015,"Message Book, Wirebound, Four 5 1/2"" X 4"" Form...",Central,Consumer,07-01-2020,Standard Class,Texas,Paper
1,Office Supplies,Naperville,United States,Phillina Ober,GBC,04-01-2020,US-2020-112326,60540,GBC Standard Plastic Binding Systems Combs,Central,Home Office,08-01-2020,Standard Class,Illinois,Binders
2,Office Supplies,Naperville,United States,Phillina Ober,Avery,04-01-2020,US-2020-112326,60540,Avery 508,Central,Home Office,08-01-2020,Standard Class,Illinois,Labels
3,Office Supplies,Naperville,United States,Phillina Ober,SAFCO,04-01-2020,US-2020-112326,60540,SAFCO Boltless Steel Shelving,Central,Home Office,08-01-2020,Standard Class,Illinois,Storage
4,Office Supplies,Philadelphia,United States,Mick Brown,Avery,05-01-2020,US-2020-141817,19143,Avery Hi-Liter EverBold Pen Style Fluorescent ...,East,Consumer,12-01-2020,Standard Class,Pennsylvania,Art
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10189,Office Supplies,New York City,United States,Patrick O'Donnell,Wilson Jones,30-12-2023,US-2023-143259,10009,Wilson Jones Legal Size Ring Binders,East,Consumer,03-01-2024,Standard Class,New York,Binders
10190,Office Supplies,Fairfield,United States,Erica Bern,GBC,30-12-2023,US-2023-115427,94533,GBC Binding covers,West,Corporate,03-01-2024,Standard Class,California,Binders
10191,Office Supplies,Loveland,United States,Jill Matthias,Other,30-12-2023,US-2023-156720,80538,Bagged Rubber Bands,West,Consumer,03-01-2024,Standard Class,Colorado,Fasteners
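10192,Technology,New York City,United States,Patrick O'Donnell,Other,30-12-2023,US-2023-143259,10009,Gear Head AU3700S Headset,East,Consumer,03-01-2024,Standard Class,New York,Phones
10193,Office Supplies,Charlottetown,Canada,Harry Olson,Wilson Jones,30-12-2023,CA-2023-143500,C0A,Wilson Jones Impact Binders,East,Consumer,03-01-2024,Standard Class,Prince Edward Island,Binders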
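
Typing fifteen column names by hand is easy to get wrong. As an alternative sketch (not the approach used in the cell above), `select_dtypes` can pick columns by data type instead; the frame `demo` below is made up for illustration:

```python
import pandas as pd

# Hypothetical miniature of the store data
demo = pd.DataFrame({
    "Category": ["Office Supplies", "Furniture"],
    "Order Date": ["03-01-2020", "06-01-2020"],
    "Quantity": [2, 9],
    "Sales": [16.448, 2573.82],
})

# Keep every column whose dtype is not numeric
non_numeric = demo.select_dtypes(exclude="number")
print(non_numeric.columns.tolist())  # ['Category', 'Order Date']
```

Excluding `"number"` drops both the integer and the float columns, leaving the text columns without naming them one by one.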


In [165]:
store_categorical_data.dtypes  # every column here is stored as object (plain text), not as the pandas category dtype

Category          object
City              object
Country/Region    object
Customer Name     object
Manufacturer      object
Order Date        object
Order ID          object
Postal Code       object
Product Name      object
Region            object
Segment           object
Ship Date         object
Ship Mode         object
State/Province    object
Sub-Category      object
dtype: object
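
Note that `Order Date` and `Ship Date` are listed as `object`, meaning the dates are held as plain strings. One possible next step, sketched here under the assumption that every value follows the day-month-year pattern seen in the sample rows, is to parse them with `pd.to_datetime`. The two-row frame `demo` is illustrative and reuses dates from the first and last rows shown earlier:

```python
import pandas as pd

# Dates copied from rows 0 and 10193 of the output shown earlier
demo = pd.DataFrame({
    "Order Date": ["03-01-2020", "30-12-2023"],
    "Ship Date": ["07-01-2020", "03-01-2024"],
})

# Assumption: all values use the DD-MM-YYYY layout
for col in ["Order Date", "Ship Date"]:
    demo[col] = pd.to_datetime(demo[col], format="%d-%m-%Y")

print(demo["Order Date"].dt.month.tolist())  # [1, 12]

# Date arithmetic works once the columns are real datetimes
days_to_ship = (demo["Ship Date"] - demo["Order Date"]).dt.days
print(days_to_ship.tolist())  # [4, 4]
```

Passing an explicit `format` avoids day and month being swapped for ambiguous values such as `03-01-2020`.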

In [159]:
store_numerical_data = store[['Discount', 'Profit', 'Quantity', 'Sales']]
store_numerical_data   # the four numeric columns

Unnamed: 0,Discount,Profit,Quantity,Sales
0,0.2,5.5512,2,16.448
1,0.8,-5.4870,2,3.540
2,0.2,4.2717,3,11.784
3,0.2,-64.7748,3,272.736
4,0.2,4.8840,3,19.536
...,...,...,...,...
10189,0.2,19.7910,3,52.776
10190,0.2,6.4750,2,20.720
10191,0.2,-0.6048,3,3.024
10192,0.0,2.7279,7,90.930
10193,0.2,-0.6048,3,3.024


In [161]:
store_numerical_data.dtypes  # show the dtype of each numeric column

Discount    float64
Profit      float64
Quantity      int64
Sales       float64
dtype: object
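
The same numeric subset can be obtained without listing names, and then summarised. A minimal sketch on a made-up three-row frame (`demo` is hypothetical; the values echo a few of the rows shown earlier):

```python
import pandas as pd

# Hypothetical miniature of the store data
demo = pd.DataFrame({
    "Category": ["Office Supplies", "Furniture", "Office Supplies"],
    "Discount": [0.2, 0.0, 0.8],
    "Quantity": [2, 9, 2],
    "Sales": [16.448, 2573.82, 3.54],
})

# Keep only integer and float columns
numeric = demo.select_dtypes(include="number")
print(numeric.columns.tolist())   # ['Discount', 'Quantity', 'Sales']
print(numeric["Quantity"].sum())  # 13

# describe() reports count, mean, std, min, quartiles, and max per column
summary = numeric.describe()
print(summary.loc["max", "Sales"])  # 2573.82
```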