# 1. Introduction to Pandas

Pandas is a fast, powerful, flexible, and easy-to-use open-source data analysis and manipulation tool, built on top of the Python programming language.

## Key Features of Pandas

- **DataFrame Object:** Provides a tabular, spreadsheet-like data structure with labeled axes (rows and columns).
- **Data Alignment:** Operations between DataFrame and Series objects automatically align data on their row and column labels.
- **Powerful Data Manipulation:** Includes functions for reshaping, pivoting, slicing, indexing, merging, and more.
- **Handling Missing Data:** Missing values are represented as NaN/NA, and methods such as `isna`, `fillna`, and `dropna` detect, fill, or drop them.
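
As a minimal sketch of the alignment and missing-data features listed above, the following adds two Series whose labels only partly overlap. The Series names and values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Two Series with partially overlapping labels (hypothetical data)
s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

# Addition aligns on index labels; a label present in only one Series yields NaN
total = s1 + s2

# Missing values can then be filled or dropped
filled = total.fillna(0)
dropped = total.dropna()

print(total)
print(filled)
print(dropped)
```

Only the labels `b` and `c` exist in both Series, so only those positions hold a sum; `a` and `d` become missing values until they are filled or dropped.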

## Basic Operations with Pandas

Here's how you can perform some basic operations with Pandas.

In [1]:
# Importing Pandas
import pandas as pd

# Creating a simple DataFrame
data = {'Name': ['John', 'Anna', 'Peter', 'Linda'],
        'Age': [28, 34, 29, 32]}
df = pd.DataFrame(data)

# Displaying the DataFrame
print(df)

    Name  Age
0   John   28
1   Anna   34
2  Peter   29
3  Linda   32


# 2. Understanding Pandas DataFrame

A <ins>DataFrame</ins> is a two-dimensional, size-mutable, and potentially *heterogeneous* tabular data structure with labeled axes (rows and columns) in Pandas. It is one of the most commonly used Pandas objects for data manipulation and analysis.

## Key Features of DataFrame

- **Heterogeneous data types:** Can store columns of different types (e.g., integer, string, float).
- **Size mutability:** Rows and columns can be added or removed after the DataFrame is created.
- **Data manipulation:** Supports an extensive range of operations such as slicing, reshaping, merging, grouping, and more.
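
The features above can be illustrated with a short sketch. The DataFrame `people` and its `Height` and `Senior` columns are hypothetical and used only for this example.

```python
import pandas as pd

# Columns of different types in one DataFrame (hypothetical data)
people = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter'],   # strings
    'Age': [28, 34, 29],                 # integers
    'Height': [1.80, 1.65, 1.75],        # floats
})

# Each column keeps its own dtype
dtypes = people.dtypes
print(dtypes)

# Size mutability: add a derived column, then drop an existing one
people['Senior'] = people['Age'] > 30
people = people.drop(columns=['Height'])

print(people)
```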

## Creating a DataFrame

You can create a DataFrame from various inputs, such as lists, dictionaries, Series, or another DataFrame.

### Example of Creating a DataFrame

```python
import pandas as pd

data = {
    'Column1': [1, 2, 3, 4],
    'Column2': ['a', 'b', 'c', 'd']
}

df = pd.DataFrame(data)
```

This creates a DataFrame with two columns (`Column1` and `Column2`) and four rows, labeled 0 to 3 by the default index.

In [7]:
# Creating a DataFrame
import pandas as pd

data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    #'Age': [28, 34, 32, ''], #-> empty 
    'Age': [28, 34, 32, 24],
    'Score': ['A', 'B', 'C', 'D'],
    'Hobby': ['Golf', 'Reading', 'Soccer', 'Chatting']
}

df = pd.DataFrame(data)
print(df)

    Name  Age Score     Hobby
0   John   28     A      Golf
1   Anna   34     B   Reading
2  Peter   32     C    Soccer
3  Linda   24     D  Chatting
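
The dictionary-of-lists form used above is only one of the accepted inputs. A minimal sketch of the other inputs mentioned earlier (lists, Series, and another DataFrame) might look like this; all variable names are hypothetical.

```python
import pandas as pd

# From a list of dicts: each dict becomes one row
from_records = pd.DataFrame([
    {'Name': 'John', 'Age': 28},
    {'Name': 'Anna', 'Age': 34},
])

# From a list of lists, with explicit column names
from_lists = pd.DataFrame([['John', 28], ['Anna', 34]], columns=['Name', 'Age'])

# From a dict of Series: values are aligned on the Series index
from_series = pd.DataFrame({
    'Name': pd.Series(['John', 'Anna']),
    'Age': pd.Series([28, 34]),
})

# From another DataFrame: copy() returns an independent object
from_frame = from_records.copy()

print(from_records)
print(from_lists)
print(from_series)
```

All three constructions should print the same two-row table, which makes it easy to pick whichever input shape matches the data at hand.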


## Accessing Data in DataFrame

DataFrames support several ways of accessing data, such as selecting columns or rows, filtering data, and applying functions.

### Example of Accessing Data



In [8]:
# Accessing a specific column
print(df['Name'])

0     John
1     Anna
2    Peter
3    Linda
Name: Name, dtype: object


In [9]:
# Accessing a specific row
print(df.iloc[3])

Name        Linda
Age            24
Score           D
Hobby    Chatting
Name: 3, dtype: object


In [27]:
# Slicing rows 1-2 and the first two columns by position
print(df.iloc[1:3, 0:2])

    Name  Age
1   Anna   34
2  Peter   32


In [16]:
# Slicing rows by position (all columns)
print(df[1:3])

    Name  Age Score    Hobby
1   Anna   34     B  Reading
2  Peter   32     C   Soccer


In [23]:
# Slicing columns only (all rows, first two columns)
print(df.iloc[:,0:2])

    Name  Age
0   John   28
1   Anna   34
2  Peter   32
3  Linda   24
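
The cells above cover selection by position. Filtering and applying functions, which are also mentioned in this section, can be sketched as follows. The block rebuilds the same DataFrame so that it runs on its own; the result variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 34, 32, 24],
    'Score': ['A', 'B', 'C', 'D'],
    'Hobby': ['Golf', 'Reading', 'Soccer', 'Chatting'],
})

# Filtering: keep the rows where a boolean condition holds
over_30 = df[df['Age'] > 30]

# Label-based selection with .loc: rows by condition, columns by name
names_over_30 = df.loc[df['Age'] > 30, ['Name', 'Age']]

# Applying a function to every value in a column
name_lengths = df['Name'].apply(len)

print(over_30)
print(names_over_30)
print(name_lengths)
```

Note that `.iloc` selects by integer position, while `.loc` selects by label or boolean mask.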


## Reading and Writing Data

Pandas can read from and write to a variety of file formats, including CSV, Excel, and JSON.

### Example of Reading a CSV File

```python
df = pd.read_csv('file.csv')
```

### Example of Writing to an Excel File

```python
# Writing .xlsx files requires an Excel engine such as openpyxl to be installed
df.to_excel('file.xlsx', sheet_name='Sheet1')
```
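
To see reading and writing work together, here is a minimal round-trip sketch using CSV and JSON, which need no extra dependencies. The file names `people.csv` and `people.json` are hypothetical, and a temporary directory is used so that nothing is left on disk.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna'], 'Age': [28, 34]})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, 'people.csv')    # hypothetical file name
    json_path = os.path.join(tmp_dir, 'people.json')  # hypothetical file name

    # index=False keeps the row index out of the CSV file
    df.to_csv(csv_path, index=False)
    from_csv = pd.read_csv(csv_path)

    # orient='records' writes one JSON object per row
    df.to_json(json_path, orient='records')
    from_json = pd.read_json(json_path, orient='records')

print(from_csv)
print(from_json)
```

Without `index=False`, the row index is written as an extra unnamed column and shows up as `Unnamed: 0` when the file is read back.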

# 3. Reading CSV Files with Pandas

Reading CSV files is one of the most common tasks in data analysis, and Pandas is well suited to loading and processing this kind of structured data. Let's look at how to use Pandas for this purpose.

## The `read_csv` Function

Pandas provides the `read_csv` function, which allows you to quickly read and parse a CSV file into a DataFrame. This function is highly customizable with numerous parameters to handle different types of CSV formats.

### Basic Usage

```python
import pandas as pd

df = pd.read_csv('path_to_file.csv')
```

This will read the CSV file located at 'path_to_file.csv' into a Pandas DataFrame.
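
A few commonly used `read_csv` parameters can be sketched as follows. The data is hypothetical and is read from an in-memory string through `io.StringIO`, so the example runs without a file on disk; the `'missing'` marker is an assumption made for this example.

```python
import io

import pandas as pd

# In-memory stand-in for a CSV file (hypothetical data)
csv_text = """id,name,score,comment
1,John,0.8,ok
2,Anna,missing,late
3,Peter,0.5,ok
"""

df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=['id', 'name', 'score'],  # read only these columns
    index_col='id',                   # use the 'id' column as the row index
    na_values=['missing'],            # treat 'missing' as a missing value
)

print(df)
print(df.dtypes)
```

Because `'missing'` is parsed as a missing value, the `score` column is read as floating point rather than as text.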

### Handling Different Delimiters and Headers

If your CSV file uses a delimiter other than a comma, or if you need to handle headers in a specific way, you can specify these using additional parameters:

```python
# Using a different delimiter
df = pd.read_csv('path_to_file.csv', delimiter=';')

# No header in the CSV file
df = pd.read_csv('path_to_file.csv', header=None)
```

These examples show how to specify a different delimiter and how to handle files without a header row.
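
The same two options can be tried without any files by reading from in-memory strings. This is a minimal sketch with hypothetical data; the `names` parameter is an addition that supplies column labels when the file has none.

```python
import io

import pandas as pd

# Semicolon-separated data with a header row (hypothetical data)
semicolon_text = "name;age\nJohn;28\nAnna;34\n"
df_semicolon = pd.read_csv(io.StringIO(semicolon_text), delimiter=';')

# Comma-separated data without a header row
no_header_text = "John,28\nAnna,34\n"

# header=None alone labels the columns with integers starting at 0
df_default_names = pd.read_csv(io.StringIO(no_header_text), header=None)

# names= supplies explicit column labels
df_named = pd.read_csv(io.StringIO(no_header_text), header=None, names=['name', 'age'])

print(df_semicolon)
print(df_default_names)
print(df_named)
```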

In [24]:
# Example of reading a CSV file
import pandas as pd

# Assuming 'scores.csv' is a CSV file in the current directory
df = pd.read_csv('scores.csv')
df.head()  # Displaying the first few rows of the DataFrame

FileNotFoundError: [Errno 2] No such file or directory: 'scores.csv'

In [3]:
df.describe()

         Python       Sql        ML   Tableau     Excel
count     200.0     200.0     200.0     200.0     200.0
mean     0.5141   0.49585   0.51435   0.49515   0.47495
std    0.305749  0.290694  0.285211  0.292463  0.281686
min         0.0      0.01       0.0      0.01       0.0
25%      0.2375    0.2275    0.2675      0.24    0.2275
50%       0.545      0.49      0.54       0.5     0.485
75%         0.8      0.74      0.77      0.74    0.7025
max         1.0       1.0       1.0       1.0      0.99
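
The result of `describe()` is itself a DataFrame, so individual statistics can be looked up by label. The sketch below uses a small hypothetical DataFrame, not the contents of `scores.csv`.

```python
import pandas as pd

# Hypothetical scores; not the data from scores.csv
scores = pd.DataFrame({
    'Python': [0.2, 0.4, 0.6, 0.8],
    'Sql': [0.1, 0.3, 0.5, 0.7],
})

summary = scores.describe()

# Rows are labeled by statistic, columns by the original column name
python_mean = summary.loc['mean', 'Python']
sql_median = summary.loc['50%', 'Sql']

print(summary)
print(python_mean, sql_median)
```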
