# 🐼 Pandas Notes
**Spring 2025 – Data Practice Journal**

This notebook contains notes and examples from working through the **Pandas** section of Codecademy's Analyze Financial Data with Python course.  

_Note: This notebook evolves over time as I continue to practice and build fluency._

In [1]:
import pandas as pd

In [2]:
import numpy as np

## DataFrames

### Create a DataFrame

A DataFrame is an object that stores data as rows and columns. It can be created manually or filled with data from a CSV, Excel spreadsheet, or SQL query.

DataFrames have rows and columns. Each column has a name, which is usually a string. Each row has an index label; by default these labels are integers starting at 0.

DataFrames may contain different data types such as strings, ints, floats, tuples, etc.

A dictionary can be passed into `pd.DataFrame()`. Each key is a column name and each value is a list of column values. The lists must all be the same length. Example:

In [4]:
df1 = pd.DataFrame({
    'name': ['John Smith', 'Jane Doe', 'Joe Schmo'],
    'address': ['123 Main St.', '456 Maple Ave.', '789 Broadway'],
    'age': [34, 28, 51]
})

print(df1)

         name         address  age
0  John Smith    123 Main St.   34
1    Jane Doe  456 Maple Ave.   28
2   Joe Schmo    789 Broadway   51
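
As a quick check on the points above, here is a minimal sketch that inspects the column names, the default integer index, and the per-column dtypes of a small DataFrame, then shows what happens when the lists differ in length. The variable names `people` and `mismatch_raised` and the data itself are made up for illustration.

```python
import pandas as pd

# Illustrative data only, modeled on the example above.
people = pd.DataFrame({
    'name': ['John Smith', 'Jane Doe', 'Joe Schmo'],
    'age': [34, 28, 51]
})

print(list(people.columns))  # column names
print(list(people.index))    # default index labels
print(people.dtypes)         # one dtype per column

# Lists of different lengths are rejected with a ValueError.
try:
    pd.DataFrame({'name': ['John Smith', 'Jane Doe'], 'age': [34]})
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print('ValueError:', err)
```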


Data can also be added using *lists*.

For example, you can pass in a list of lists, where each one represents a row of data. Use the keyword `columns` to pass a list of column names.

In [5]:
df2 = pd.DataFrame([
    ['John Smith', '123 Main St.', 34],
    ['Jane Doe', '456 Maple Ave.', 28],
    ['Joe Schmo', '789 Broadway', 51]
    ],
    columns=['name', 'address', 'age'])

print(df2)

         name         address  age
0  John Smith    123 Main St.   34
1    Jane Doe  456 Maple Ave.   28
2   Joe Schmo    789 Broadway   51
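
The two construction styles should describe the same table: a list of lists is row-oriented, while a dictionary is column-oriented. The sketch below (variable names are my own) turns the rows into a column dictionary and uses `DataFrame.equals()` to confirm that both routes give the same result.

```python
import pandas as pd

columns = ['name', 'address', 'age']
rows = [
    ['John Smith', '123 Main St.', 34],
    ['Jane Doe', '456 Maple Ave.', 28],
    ['Joe Schmo', '789 Broadway', 51],
]

# Row-oriented: each inner list is one row.
from_rows = pd.DataFrame(rows, columns=columns)

# Column-oriented: regroup the same values so each key maps to one column.
column_dict = {col: [row[i] for row in rows] for i, col in enumerate(columns)}
from_dict = pd.DataFrame(column_dict)

same = from_rows.equals(from_dict)
print(same)
```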


## DataFrames: Loading and Saving CSVs

To load CSV data into a DataFrame in Pandas, use `pd.read_csv()`.

In [11]:
# sample_data_frame = pd.read_csv('sample.csv')
# This would read an existing CSV file into a variable called sample_data_frame.
# 'sample.csv' is a placeholder, not a real file, so the line is commented out.
# Running it as-is raises:
# FileNotFoundError: [Errno 2] No such file or directory: 'sample.csv'
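
Since `sample.csv` does not exist here, one way to try `pd.read_csv()` without a file on disk is to hand it a file-like object. This is only a sketch: the CSV text and the name `csv_text` are invented for the example, and `io.StringIO` stands in for a real file.

```python
import io
import pandas as pd

# Made-up CSV content; the first line becomes the column names.
csv_text = "name,age\nJohn Smith,34\nJane Doe,28\n"

# read_csv accepts a path or a file-like object such as StringIO.
sample_data_frame = pd.read_csv(io.StringIO(csv_text))

print(sample_data_frame)
```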

To save a DataFrame to a CSV file, call its `.to_csv()` method.

In [12]:
df1.to_csv('new-csv-file.csv')
# In this example, df1 is the DataFrame object on which the .to_csv() method is called.
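
One thing worth knowing about `.to_csv()` is that it writes the row index as an extra column unless told otherwise. The sketch below saves a small made-up DataFrame to a temporary folder and reads it back, once with the default settings and once with `index=False`. The file name `people.csv` and the variable names are hypothetical.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({'name': ['John Smith', 'Jane Doe'], 'age': [34, 28]})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, 'people.csv')  # hypothetical file name

    # Default: the row index is written as an unnamed first column.
    df.to_csv(csv_path)
    with_index = pd.read_csv(csv_path)

    # index=False leaves the row index out of the file.
    df.to_csv(csv_path, index=False)
    without_index = pd.read_csv(csv_path)

print(list(with_index.columns))     # ['Unnamed: 0', 'name', 'age']
print(list(without_index.columns))  # ['name', 'age']
```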

## Inspect a DataFrame
If a DataFrame is small, you can display it using `print(df)`.

If it's a larger DataFrame, it's helpful to be able to inspect a few rows without having to look at the entire DataFrame.

The method `.head()` returns the first 5 rows of a DataFrame. If you want to see more rows, you can pass the number of rows as the argument `n`. For example, `df.head(10)` would show the first 10 rows.

The method **`df.info()`** prints a summary of the DataFrame: the index range, each column's name, non-null count, and dtype, and the approximate memory usage.

In [13]:
print(df1.head())

         name         address  age
0  John Smith    123 Main St.   34
1    Jane Doe  456 Maple Ave.   28
2   Joe Schmo    789 Broadway   51


In [14]:
df1.info()  # info() prints its own report, so print() is not needed

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   name     3 non-null      object
 1   address  3 non-null      object
 2   age      3 non-null      int64 
dtypes: int64(1), object(2)
memory usage: 204.0+ bytes
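
With only three rows, `df1` cannot show `.head()` cutting anything off. Here is a small sketch on a made-up 20-row DataFrame (the name `numbers` and its columns are my own) that checks the default of 5 rows, the effect of passing `n`, and the fact that `.info()` prints its report itself and returns `None`.

```python
import numpy as np
import pandas as pd

# Made-up data: 20 rows so that head() has something to cut off.
numbers = pd.DataFrame({
    'n': np.arange(20),
    'n_squared': np.arange(20) ** 2,
})

first_five = numbers.head()    # default n is 5
first_ten = numbers.head(10)   # same as numbers.head(n=10)

print(first_five)
print(len(first_ten))

# info() writes its summary to standard output and returns None,
# so wrapping it in print() adds a trailing "None" line.
result = numbers.info()
print(result is None)
```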
