# An Introduction to Pandas
Pandas is an open-source Python library that provides data structures and tools for data analysis. It is designed to make data manipulation and analysis tasks simpler and more efficient. The name "Pandas" is derived from "panel data," a term for multi-dimensional structured datasets commonly used in econometrics and finance [Pandas Developers, 2023].

## Data Structures:

1. **Series**: A Series is a one-dimensional labeled array, similar to a NumPy array but with an associated index. The index gives each element a meaningful label, which supports label-based retrieval and automatic alignment of data [Pandas Developers, 2023].

2. **DataFrame**: A DataFrame is a two-dimensional labeled data structure, similar to a spreadsheet or SQL table. It consists of rows and columns, and each column can hold a different data type. DataFrames support operations on structured data such as filtering, joining, and grouping [Pandas Developers, 2023].


<font color='Blue'><b>Example - Series:</b></font> A Pandas Series object can be created with the [pd.Series](https://pandas.pydata.org/docs/reference/api/pandas.Series.html) constructor.

In [None]:
import pandas as pd

# Create a Pandas Series with custom index
data = pd.Series([10, 20, 30, 40], index=['A', 'B', 'C', 'D'])

# Print the Pandas Series
print("Pandas Series:")
print(data)
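
To illustrate the label-based retrieval and alignment described for the Series, here is a minimal sketch. It reuses the Series from the example and adds a second Series, `other`, which is a hypothetical addition whose labels only partly overlap with the first.

```python
import pandas as pd

# Same Series as in the example above
data = pd.Series([10, 20, 30, 40], index=['A', 'B', 'C', 'D'])

# Retrieve a single element by its label
value_b = data['B']
print("Value at label 'B':", value_b)

# Retrieve several elements by passing a list of labels
subset = data[['A', 'D']]
print(subset)

# Hypothetical second Series whose labels only partly overlap
other = pd.Series([1, 2, 3], index=['B', 'C', 'E'])

# Addition aligns on labels, not positions;
# labels present on only one side produce NaN
total = data + other
print(total)
```

Only the labels `B` and `C` appear in both Series, so only those two positions hold numeric sums in `total`.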

<font color='Blue'><b>Example - DataFrame:</b></font> A Pandas DataFrame object can be created with the [pd.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) constructor.

In [None]:
import pandas as pd

# Create a sample DataFrame from a dictionary (keys become column names)
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 22],
        'City': ['Calgary', 'Edmonton', 'Red Deer']
        }

df = pd.DataFrame(data)

# Display the DataFrame
print("DataFrame:")
display(df)
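
The filtering, joining, and grouping operations mentioned for DataFrames can be sketched as follows. This is a minimal illustration rather than a complete tour: the `provinces` table is a hypothetical second table introduced only to have something to join against.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['Calgary', 'Edmonton', 'Red Deer']})

# Each column keeps its own data type
print(df.dtypes)

# Filtering: keep only the rows where Age is greater than 23
older = df[df['Age'] > 23]
print(older)

# Joining: merge with a hypothetical lookup table on the shared 'City' column
provinces = pd.DataFrame({'City': ['Calgary', 'Edmonton', 'Red Deer'],
                          'Province': ['Alberta', 'Alberta', 'Alberta']})
joined = df.merge(provinces, on='City', how='left')
print(joined)

# Grouping: average age per province
mean_age = joined.groupby('Province')['Age'].mean()
print(mean_age)
```

The boolean expression `df['Age'] > 23` produces one True/False value per row, and indexing the DataFrame with it keeps the rows marked True.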

## Reading and Writing Data:

Pandas is widely used for reading and writing data in a variety of file formats. The subsections below cover both directions [Pandas Developers, 2023].

### Reading Data

Pandas can read data from formats such as CSV, Excel, and SQL databases. The most commonly used function is `pandas.read_csv()`, which reads a CSV file into a DataFrame.

In [None]:
# Download the sample CSV file (-N skips the download if the local copy is up to date)
!wget -N https://download.microsoft.com/download/4/C/8/4C830C0C-101F-4BF2-8FCB-32D9A8BA906A/Import_User_Sample_en.csv

In [None]:
import pandas as pd

# Read a CSV file into a DataFrame
data = pd.read_csv('Import_User_Sample_en.csv')

# Display the DataFrame
print("The DataFrame:")
display(data)
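
`read_csv` also accepts optional parameters that control what gets loaded. The sketch below is a minimal illustration that assumes a small inline CSV (hypothetical sample data wrapped in `io.StringIO`) in place of the downloaded file, so that it runs without network access. `usecols` and `nrows` are standard `read_csv` parameters.

```python
import io
import pandas as pd

# Inline CSV text stands in for a file on disk (hypothetical sample data)
csv_text = """Name,Age,City
Alice,25,Calgary
Bob,30,Edmonton
Charlie,22,Red Deer
"""

# read_csv accepts a file path, a URL, or a file-like object
full = pd.read_csv(io.StringIO(csv_text))

# usecols selects a subset of columns; nrows limits how many data rows are read
partial = pd.read_csv(io.StringIO(csv_text), usecols=['Name', 'Age'], nrows=2)

print(full)
print(partial)
```

Limiting columns and rows this way can be useful for previewing a large file before loading it in full.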

You can also read Excel files using `pandas.read_excel()`:

In [None]:
import pandas as pd

# Read an Excel file into a DataFrame
# Here the Excel file is hosted on the web (reading .xlsx files requires the openpyxl package)
data_excel = pd.read_excel('https://go.microsoft.com/fwlink/?LinkID=521962', sheet_name='Sheet1')

# Display the first five rows of the DataFrame
print("First five rows of the DataFrame:")
display(data_excel.head())

# By default, head() returns the first 5 rows of a DataFrame.

### Writing and Exporting Data

Pandas can export data to a variety of formats. A common choice is the `DataFrame.to_csv()` method, which writes a DataFrame to a CSV (Comma-Separated Values) file:

In [None]:
import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 22],
        'City': ['Calgary', 'Edmonton', 'Red Deer']
        }

df = pd.DataFrame(data)

# Write the DataFrame to a CSV file
csv_filename = 'data.csv'
df.to_csv(csv_filename, index=False)

# Print a message indicating that the data has been written
print(f"Data written to {csv_filename}")
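
To show what `index=False` changes, the following sketch compares the CSV text produced with and without the row index. It relies on `to_csv()` returning the CSV as a string when no path is given, and it reads the text back through `io.StringIO` as an in-memory stand-in for a file.

```python
import io
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['Calgary', 'Edmonton', 'Red Deer']})

# With no path argument, to_csv returns the CSV text as a string
with_index = df.to_csv()
without_index = df.to_csv(index=False)

# The first version has an extra leading column holding the row index
print(with_index)
print(without_index)

# Round trip: read the text back and compare it with the original data
restored = pd.read_csv(io.StringIO(without_index))
print(restored)
```

Writing with `index=False` avoids an extra unnamed column appearing when the file is read back with `read_csv`.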

For writing to Excel files, you can use `DataFrame.to_excel()`:

In [None]:
# Write the DataFrame from the previous cell to an Excel file (requires the openpyxl package)
df.to_excel('new_data.xlsx', index=False)

## Pandas Basics:


| Command                    | Description                                                               |
|----------------------------|---------------------------------------------------------------------------|
| `pd.DataFrame(data)`       | Create a DataFrame from data like a dictionary, array, or list.           |
| `data.info()`              | Display basic information about the DataFrame, including data types and non-null counts. |
| `data.head(n)`             | Display the first n rows of the DataFrame (default is 5).                |
| `data.tail(n)`             | Display the last n rows of the DataFrame (default is 5).                 |
| `data.describe()`          | Display summary statistics of numerical columns (count, mean, std, min, max, quartiles). |
| `data.shape`               | Return the number of rows and columns in the DataFrame as a tuple `(rows, columns)`. |
| `data.columns`             | Access the column labels of the DataFrame.                                |


In [None]:
import pandas as pd
import numpy as np

# Create the DataFrame
data = pd.DataFrame({'A': np.arange(0, 100),
                     'B': np.arange(1000, 900, -1)})

# Display the DataFrame itself
print("The DataFrame:")
display(data)

In [None]:
# Get DataFrame information using data.info()
print("DataFrame Info:")
data.info()

In [None]:
# Display the first 5 rows (the default for head())
print("Displaying First 5 Rows:")
display(data.head())

In [None]:
# Display the last 5 rows (the default for tail())
print("Displaying Last 5 Rows:")
display(data.tail())
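
Both `head()` and `tail()` accept an explicit row count `n`, as listed in the table. A minimal sketch, rebuilding the same DataFrame so that it runs on its own:

```python
import numpy as np
import pandas as pd

# Same DataFrame as in the cells above
data = pd.DataFrame({'A': np.arange(0, 100),
                     'B': np.arange(1000, 900, -1)})

# Pass n explicitly to override the default of 5
first_ten = data.head(10)
last_three = data.tail(3)

print(first_ten)
print(last_three)
```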

In [None]:
# Display summary statistics
print("\nSummary Statistics:")
display(data.describe())
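
`describe()` returns its result as a DataFrame, so individual statistics can be looked up by label rather than read off the printed table. A short sketch, again rebuilding the same DataFrame so that it is self-contained:

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'A': np.arange(0, 100),
                     'B': np.arange(1000, 900, -1)})

stats = data.describe()

# Rows are statistic names and columns are the original column labels
mean_a = stats.loc['mean', 'A']
min_b = stats.loc['min', 'B']
max_b = stats.loc['max', 'B']

print("Mean of A:", mean_a)
print("Range of B:", min_b, "to", max_b)

# The same numbers are available from the column methods directly
print("Mean of A (direct):", data['A'].mean())
```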

In [None]:
# Get the shape of the DataFrame as a (rows, columns) tuple
print("DataFrame shape (rows, columns):", data.shape)

In [None]:
# Get the column labels (returned as a pandas Index object)
print("Column labels:", data.columns)
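
Because `shape` is a plain tuple and `columns` is an Index object, both can be used directly in ordinary Python code. A small sketch under the same setup as the cells above:

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'A': np.arange(0, 100),
                     'B': np.arange(1000, 900, -1)})

# Unpack the shape tuple into separate variables
n_rows, n_cols = data.shape
print(f"{n_rows} rows, {n_cols} columns")

# Convert the column Index to a regular Python list
column_names = data.columns.tolist()
print(column_names)

# Loop over the column labels, for example to report each column's dtype
for name in data.columns:
    print(name, data[name].dtype)
```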

These are just a few basic commands in Pandas. The library offers a wide range of functions for data manipulation, exploration, and analysis. You can refer to the official Pandas documentation for more details and examples: [Pandas Documentation](https://pandas.pydata.org/pandas-docs/stable/index.html).