# Introduction to Data Handling: NumPy and Pandas

## Overview
In this notebook, we will explore two of the most widely used Python libraries for data manipulation and analysis: **NumPy** and **Pandas**. NumPy handles numerical arrays, Pandas handles labeled tabular data, and you'll reach for both often when working with datasets.

We'll cover the following topics:

- Introduction to NumPy
- NumPy arrays and operations
- Introduction to Pandas
- Pandas Series and DataFrames
- Data manipulation with Pandas

By the end of this notebook, you'll be able to handle arrays and tabular data effectively using NumPy and Pandas.

## 1. Introduction to NumPy

`NumPy` (Numerical Python) is a library for numerical computing in Python. It provides a multi-dimensional array type, `ndarray`, along with functions for mathematical and statistical operations on these arrays.
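
To make "operations on arrays" concrete, here is a minimal sketch of element-wise (vectorized) arithmetic, compared with the equivalent list comprehension. The names `readings` and `scaled` and the sample values are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

values = [1.0, 2.0, 3.0, 4.0]   # illustrative sample data
readings = np.array(values)

# Arithmetic on an array is applied to every element, with no explicit loop
scaled = readings * 2 + 1

# The same computation on a plain Python list needs a comprehension
scaled_list = [v * 2 + 1 for v in values]

print(scaled)
print(scaled_list)
print(readings.mean())
```

Both versions compute the same numbers; the array version is shorter and, for large inputs, typically much faster because the loop runs inside NumPy rather than in Python.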

Let's create a simple NumPy array and compute a few basic statistics on it.

In [None]:
# Example: Creating a NumPy array
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
print('NumPy Array:', arr)

# Basic operations
print('Sum:', np.sum(arr))
print('Mean:', np.mean(arr))
# np.std returns the population standard deviation by default (ddof=0)
print('Standard Deviation:', np.std(arr))

### 1.1. Multi-dimensional Arrays

NumPy arrays can have more than one dimension. Such arrays are common in scientific computing, machine learning, and other data analysis tasks. A 2D array is like a matrix with rows and columns.
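
As a rough illustration of the rows-and-columns picture, the sketch below inspects the shape of a small 2D array and aggregates along each axis. The array `grid` and its values are made up for this example.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])   # 2 rows, 3 columns

print(grid.ndim)    # number of dimensions
print(grid.shape)   # (number of rows, number of columns)

# axis=0 collapses the rows, giving one result per column
col_sums = grid.sum(axis=0)
# axis=1 collapses the columns, giving one result per row
row_sums = grid.sum(axis=1)

print(col_sums)
print(row_sums)
```

The `axis` argument names the dimension that disappears from the result, which is why `axis=0` yields per-column values.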

Let's create a 2D array, then access a single element and slice out a row and a column.

In [None]:
# Example: Creating a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print('2D Array:\n', arr_2d)

# Accessing elements
print('Element at (0,0):', arr_2d[0, 0])  # Element at first row, first column

# Slicing
print('First row:', arr_2d[0, :])  # First row
print('First column:', arr_2d[:, 0])  # First column

## 2. Introduction to Pandas

`Pandas` is a library for data manipulation and analysis, built on top of NumPy. It provides two primary data structures:

- **Series**: A one-dimensional, array-like object with an index of labels. It can hold any data type (integers, floats, strings, etc.).
- **DataFrame**: A two-dimensional, tabular data structure with labeled axes (rows and columns), similar to a spreadsheet or SQL table.
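
The relationship between the two structures can be sketched as follows: a Series pairs values with labels, and each column of a DataFrame is itself a Series. The names `temps` and `table` and the weekday labels are assumptions chosen for illustration.

```python
import pandas as pd

days = ['Mon', 'Tue', 'Wed']   # illustrative index labels

# A Series pairs each value with an index label
temps = pd.Series([21.5, 19.0, 23.1], index=days)
print(temps['Tue'])     # access by label
print(temps.iloc[0])    # access by integer position

# A DataFrame holds several labeled columns that share one index
table = pd.DataFrame(
    {'temp': [21.5, 19.0, 23.1], 'rain': [False, True, False]},
    index=days,
)

# Selecting a single column returns a Series
print(type(table['temp']))
```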

Let's create a Pandas Series from a list and access one of its elements.

In [None]:
# Example: Creating a Pandas Series
import pandas as pd

data = [10, 20, 30, 40, 50]
series = pd.Series(data)
print('Pandas Series:', series, sep='\n')

# Accessing elements by index label (the default labels are 0, 1, 2, ...)
print('Element at index 1:', series[1])

## 3. Working with Pandas DataFrames

The most commonly used data structure in Pandas is the **DataFrame**. A DataFrame allows you to store and manipulate tabular data in a structured way, similar to how data is stored in a spreadsheet.
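
As a small sketch of what "structured" means here, the example below builds a DataFrame from a list of records and inspects its shape, column names, and per-column data types. The `inventory` data is invented for illustration.

```python
import pandas as pd

# Each dictionary is one row; the keys become column names
records = [
    {'product': 'pen', 'price': 1.5, 'stock': 120},
    {'product': 'notebook', 'price': 4.0, 'stock': 35},
]
inventory = pd.DataFrame(records)

print(inventory.shape)             # (rows, columns)
print(inventory.columns.tolist())  # column names in order
print(inventory.dtypes)            # one data type per column
```

Unlike a plain 2D array, each column keeps its own data type, so text and numbers can sit side by side in the same table.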

Let's create a simple DataFrame, then access one of its columns and one of its rows.

In [None]:
# Example: Creating a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)
print('DataFrame:', df, sep='\n')

# Accessing a column by name (returns a Series)
print('Name column:', df['Name'], sep='\n')

# Accessing a row by integer position (also returns a Series)
print('First row:', df.iloc[0], sep='\n')

### 3.1. Reading and Writing Data with Pandas

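Pandas provides functions to read data from files such as CSV (comma-separated values) files and Excel spreadsheets, for example `pd.read_csv` and `pd.read_excel`, along with matching DataFrame methods such as `to_csv` and `to_excel` to write data back to these formats. Excel support relies on an optional dependency such as `openpyxl`.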
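
The write-then-read round trip can be sketched without touching the disk by using an in-memory text buffer in place of a file path. The `scores` DataFrame is a made-up example, and the use of `io.StringIO` is an assumption made to keep the sketch self-contained.

```python
import io
import pandas as pd

scores = pd.DataFrame({'student': ['Ana', 'Ben'], 'score': [88, 92]})

# to_csv accepts a file path or a file-like object such as StringIO
buffer = io.StringIO()
scores.to_csv(buffer, index=False)   # index=False leaves out the row index
csv_text = buffer.getvalue()
print(csv_text)

# read_csv parses the text back into a DataFrame
restored = pd.read_csv(io.StringIO(csv_text))
print(restored)
```

Without `index=False`, the row index would be written as an extra unnamed first column and would come back as a regular column when the file is read again.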

Let's see how to read and write a CSV file using Pandas.

In [None]:
# Example: Reading and writing a CSV file
# Writing DataFrame to a CSV file (index=False omits the row index column)
df.to_csv('data.csv', index=False)
print('DataFrame written to data.csv')

# Reading from a CSV file
df_read = pd.read_csv('data.csv')
print('DataFrame read from CSV:', df_read, sep='\n')

### 3.2. Data Manipulation with Pandas

Pandas makes common data manipulation tasks concise. You can filter rows, add or remove columns, and perform aggregation operations (like calculating the mean, sum, or count of a column).
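
The aggregation operations mentioned above can be sketched as follows. The `staff` DataFrame is a hypothetical stand-in that reuses the ages from the earlier example.

```python
import pandas as pd

staff = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
})

# Single aggregations on one column
mean_age = staff['age'].mean()
total_age = staff['age'].sum()
n_ages = staff['age'].count()   # counts non-missing values
print(mean_age, total_age, n_ages)

# Several aggregations at once; the result is a Series keyed by function name
summary = staff['age'].agg(['mean', 'sum', 'count'])
print(summary)
```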

Let's try filtering rows, adding a column, and dropping a column, using the DataFrame created earlier.

In [None]:
# Example: Filtering rows with a boolean mask
print('Rows where Age > 25:', df[df['Age'] > 25], sep='\n')

# Adding a new column (one value per existing row)
df['Salary'] = [50000, 60000, 70000]
print('DataFrame with new column:', df, sep='\n')

# Dropping a column (drop returns a new DataFrame; df itself is unchanged)
df_dropped = df.drop(columns='Salary')
print('DataFrame after dropping Salary column:', df_dropped, sep='\n')

## Conclusion

In this notebook, we explored how to use the NumPy and Pandas libraries for data handling. We learned how to create, index, and slice arrays with NumPy, and how to work with Series and DataFrames using Pandas, including reading and writing CSV files.

These libraries are essential tools for data analysis, and the techniques covered here are the foundation for working with larger, real-world datasets.