# Exploring Pandas

1. Introduction to Pandas
2. Installing Pandas
3. Basic Pandas Data Structures
    * Series
    * DataFrame
4. Reading and Writing Data
    * CSV
    * Excel
    * SQL
5. Data Cleaning
    * Handling Missing Values
    * Dropping Columns/Rows
6. Basic Operations
    * Filtering
    * Sorting
    * Aggregating
7. Data Transformation
    * Merging
    * Joining
    * Reshaping
8. Grouping and Aggregation
9. Time-Series Data
10. Visualization in Pandas

# 1. Introduction to Pandas

Pandas is a fast, powerful, and flexible open-source library for data analysis and manipulation in Python.

# 2. Installing Pandas

In [None]:
%pip install pandas

In [None]:
import pandas as pd

# 3. Basic Pandas Data Structures

### Series

* A Series is a one-dimensional labeled array capable of holding any data type.

### Key Features:

1. Labels: known as the index.
2. Data: can be integers, floats, strings, etc.

### **Creating a Series:**
You can create a Series from a list, NumPy array, or dictionary.
   

In [None]:
# From a List
sl = pd.Series([1,2,3,4])

In [None]:
# From a Dictionary
sd = pd.Series({'a': 1, 'b':2, 'c':3})
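
The NumPy case mentioned above can be sketched the same way. This is a minimal example, assuming NumPy is installed (it ships as a pandas dependency); the names `arr` and `sn` are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd

# From a NumPy array: the Series takes its dtype from the array
arr = np.array([1.5, 2.5, 3.5])
sn = pd.Series(arr)

print(sn.dtype)           # float64
print(sn.index.tolist())  # [0, 1, 2] (default integer index)
```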

## Operations


### Indexing

In [None]:
sl[0]

In [None]:
sd['b']

### Slicing

In [None]:
sl[1:3]

In [None]:
sl.sum()

In [None]:
sd.sum()

### Custom Index

In [None]:
sci = pd.Series([1,2,3,4], index = ['x','y','z','w'])

In [None]:
sci['x']

In [None]:
sci[1:3]  # integer slices are positional, even with a custom index

In [None]:
sci.sum()

In [None]:
sci.mean()

## Useful Operations

In [None]:
sci.describe()

When you create a Pandas Series with custom index labels, you are essentially mapping the index labels to the data values. The index parameter allows you to specify custom labels instead of the default integer-based index.

## Code Explanation:

    sci = pd.Series([1, 2, 3, 4], index=['x', 'y', 'z', 'w'])

- `[1, 2, 3, 4]`: the data values you want in the Series.
- `index=['x', 'y', 'z', 'w']`: the custom index labels you specify.

The resulting Series:

    x    1
    y    2
    z    3
    w    4
    dtype: int64

Key Points:

- Data Values: 1, 2, 3, 4
- Custom Index Labels: 'x', 'y', 'z', 'w'

### What Does This Mean?

- The value 1 is mapped to the index label x.
- The value 2 is mapped to the index label y.
- The value 3 is mapped to the index label z.
- The value 4 is mapped to the index label w.

By using custom index labels, you can make your Series more readable and better aligned with the context of your data. The labels work for indexing, slicing, and other operations much as the default integer positions do, with one difference: a label slice such as `sci['y':'z']` includes its end label, whereas a positional slice such as `sci[1:3]` excludes its end position.
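
To make the choice between labels and positions explicit, pandas provides the `.loc` (label-based) and `.iloc` (position-based) accessors. A minimal sketch using the same `sci` Series; the names `by_label` and `by_position` are illustrative:

```python
import pandas as pd

sci = pd.Series([1, 2, 3, 4], index=['x', 'y', 'z', 'w'])

by_label = sci.loc['y':'z']   # label slice: both endpoints are included
by_position = sci.iloc[1:3]   # positional slice: the end position is excluded

print(by_label.tolist())          # [2, 3]
print(by_position.tolist())       # [2, 3]
print(sci.loc['x':'z'].tolist())  # [1, 2, 3]
```

Using `.loc` and `.iloc` avoids any ambiguity about whether a key is meant as a label or as a position.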

# 4. Reading and Writing Data

## CSV (Comma-Separated Values)

### Reading from CSV

To read a CSV file into a DataFrame, use the `pd.read_csv()` function.

Dataset source: [Kaggle](https://www.kaggle.com/datasets/nelgiriyewithana/top-spotify-songs-2023)

In [None]:
df_csv = pd.read_csv('spotify-2023.csv', encoding='ISO-8859-1')

If ISO-8859-1 gives garbled text and you are unsure of the file's encoding, you can use the chardet library to guess it:

In [None]:
%pip install chardet

In [None]:
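import chardet

# Read the raw bytes so chardet can inspect them; the file is closed automatically
with open('spotify-2023.csv', 'rb') as f:
    rawdata = f.read()

# detect() returns a dict that includes the guessed encoding
result = chardet.detect(rawdata)
encoding = result['encoding']

df_csv = pd.read_csv('spotify-2023.csv', encoding=encoding)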

In [None]:
df_csv.head()

In [None]:
df_csv.info()

In [None]:
df_csv.describe()
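
The cells above cover reading; writing goes the other way with `DataFrame.to_csv()`. The following is a minimal round-trip sketch that does not depend on the Spotify file. The data in `csv_text` is made up for illustration, and an in-memory `io.StringIO` buffer stands in for a file on disk.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk
csv_text = "track,streams\nSong A,100\nSong B,250\n"

# read_csv accepts a path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

# Write the DataFrame back out as CSV; index=False omits the row labels
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Read the written text back to check that nothing was lost
df_back = pd.read_csv(io.StringIO(buffer.getvalue()))
print(df_back.equals(df))
```

To write to an actual file, pass a path instead of the buffer, for example `df.to_csv('output.csv', index=False)` (the filename here is only an example).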