## What is a DataFrame?

A DataFrame is pandas' 2-dimensional, labeled tabular data structure (like a spreadsheet), used for storing and manipulating data in rows and columns.

## 1. From a Python Dictionary (dict)
This is the most common method when you're manually creating small datasets.

**Example 1: Dictionary with Lists**

In [1]:
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85.5, 90.0, 95.0]
}

df = pd.DataFrame(data)
print(df)

      Name  Age  Score
0    Alice   25   85.5
1      Bob   30   90.0
2  Charlie   35   95.0


**Notes:**

* Each key becomes a column.
* Each value (list) becomes the column’s data.
* Great for structured data when you already have it in a dictionary.
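
As a quick check on the notes above, the sketch below rebuilds the same frame and inspects its column labels and inferred dtypes. It also shows that lists of unequal length are rejected. The helper variable names are illustrative only.

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85.5, 90.0, 95.0],
}
df = pd.DataFrame(data)

# Keys become column labels, in insertion order
column_names = list(df.columns)

# pandas infers one dtype per column from the list contents
age_is_int = pd.api.types.is_integer_dtype(df['Age'])
score_is_float = pd.api.types.is_float_dtype(df['Score'])

# Lists of different lengths cannot form a rectangular table
try:
    pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25]})
    unequal_lengths_accepted = True
except ValueError:
    unequal_lengths_accepted = False
```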

## 2. From a List of Dictionaries

Each dictionary is treated as a row (observation).

**Example 2:**

In [2]:
data = [
    {'Name': 'Alice', 'Age': 25, 'Score': 85.5},
    {'Name': 'Bob', 'Age': 30, 'Score': 90.0},
    {'Name': 'Charlie', 'Age': 35, 'Score': 95.0}
]

df = pd.DataFrame(data)
print(df)

      Name  Age  Score
0    Alice   25   85.5
1      Bob   30   90.0
2  Charlie   35   95.0


**Notes:**
  
* More flexible than a dictionary of lists: rows do not need identical keys, and any missing key becomes NaN.
* Used often when collecting data from APIs (JSON format).
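
API responses rarely have every field in every record. The following sketch, using made-up records in the shape of Example 2, shows how pandas fills the gaps when a key is absent from some dictionaries.

```python
import pandas as pd

# Hypothetical records: two of them are missing a key
records = [
    {'Name': 'Alice', 'Age': 25, 'Score': 85.5},
    {'Name': 'Bob', 'Age': 30},            # no 'Score'
    {'Name': 'Charlie', 'Score': 95.0},    # no 'Age'
]

df = pd.DataFrame(records)

# Count the NaN values that pandas inserted per column
missing_per_column = df.isna().sum().to_dict()
print(df)
```

Columns are the union of all keys seen, so no record is dropped for being incomplete.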

## 3. From a List of Lists (or Tuples)

Useful when you just have raw data but need to specify column names.

**Example 3:**

In [3]:
data = [
    ['Alice', 25, 85.5],
    ['Bob', 30, 90.0],
    ['Charlie', 35, 95.0]
]

columns = ['Name', 'Age', 'Score']

df = pd.DataFrame(data, columns=columns)
print(df)

      Name  Age  Score
0    Alice   25   85.5
1      Bob   30   90.0
2  Charlie   35   95.0


**Notes:**
  
* Each list/tuple = one row.
* Pass column names with columns=; if you omit them, pandas labels the columns 0, 1, 2, and so on.
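
A small sketch of how the columns argument behaves, this time with tuples as rows. Without it pandas falls back to integer labels, so naming the columns is what makes the frame readable.

```python
import pandas as pd

rows = [('Alice', 25, 85.5), ('Bob', 30, 90.0)]

# No columns argument: pandas assigns integer labels
unlabeled = pd.DataFrame(rows)
default_labels = list(unlabeled.columns)

# Same rows with explicit names
labeled = pd.DataFrame(rows, columns=['Name', 'Age', 'Score'])
print(labeled)
```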

## 4. From NumPy Arrays

Perfect for ML and scientific computing since data is often numeric.

**Example 4:**

In [4]:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

df = pd.DataFrame(arr, columns=['Feature1', 'Feature2', 'Feature3'])
print(df)

   Feature1  Feature2  Feature3
0         1         2         3
1         4         5         6


**Notes:**
  
* Useful for converting model output into a labeled format.
* Works with arrays extracted from other DataFrames too (.to_numpy() is the recommended accessor; .values also works).
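
To illustrate the round trip between arrays and frames, here is a minimal sketch that labels an array and then extracts it again with to_numpy(). The row labels sample_a and sample_b are hypothetical.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Attach column names and (hypothetical) row labels to the raw array
df = pd.DataFrame(
    arr,
    columns=['Feature1', 'Feature2', 'Feature3'],
    index=['sample_a', 'sample_b'],
)

# Go back to a plain array, dropping the labels
back = df.to_numpy()
round_trip_ok = np.array_equal(arr, back)
```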

## 5. From Pandas Series

A DataFrame can be thought of as a collection of Series objects that share an index, one Series per column.

**Example 5:**

In [8]:
s1 = pd.Series([10, 20, 30], name='X')
s2 = pd.Series([40, 50, 60], name='Y')

df = pd.concat([s1, s2], axis=1)
print(df)

    X   Y
0  10  40
1  20  50
2  30  60


**Notes:**

* Very useful when transforming columns one at a time.
* Series can come from computations or filtering.
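
As an example of Series that come from computations or filtering, the sketch below derives two new Series from an existing one and combines all three. The names Curved and Passed and the threshold of 90 are invented for illustration.

```python
import pandas as pd

scores = pd.Series([85.5, 90.0, 95.0], name='Score')

# A computed Series and a boolean Series from a comparison
curved = (scores + 2).rename('Curved')
passed = (scores >= 90).rename('Passed')

# Each Series name becomes a column label
df = pd.concat([scores, curved, passed], axis=1)
print(df)
```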

## 6. From a CSV or Excel File

Very common in real-world ML workflows.

**Example 6:**

In [10]:
df = pd.read_csv("Dataset/titanic.csv")  # Make sure the file exists
df.head()

| Unnamed: 0 | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 | | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 | | S |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 | | S |


**Notes:**
  
* Use pd.read_csv() or pd.read_excel() to import data.
* You’ll often clean or preprocess this data before using it.
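
A column named "Unnamed: 0" usually means the CSV was saved with its row index as a first column that has no header. The sketch below reproduces that situation with a small in-memory CSV (an abbreviated stand-in, not the real file) and shows how index_col=0 reads that column back as the index.

```python
import io
import pandas as pd

# In-memory stand-in for a CSV whose first column has no header
csv_text = """,PassengerId,Survived,Pclass,Sex,Age
0,1,0,3,male,22.0
1,2,1,1,female,38.0
2,3,1,3,female,26.0
"""

# Default read: the headerless column is named 'Unnamed: 0'
default = pd.read_csv(io.StringIO(csv_text))

# Treat the first column as the row index instead
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(indexed.head())
```

The same index_col argument can be passed when reading from a file path.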

## 7. From a Dictionary of Series

Each Series becomes a column, and the Series are aligned on their index. Columns can hold different types.

**Example 7:**

In [12]:
data = {
    'col1': pd.Series([1, 2, 3]),
    'col2': pd.Series(['A', 'B', 'C'])
}

df = pd.DataFrame(data)
print(df)

   col1 col2
0     1    A
1     2    B
2     3    C
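
When the Series do not share the same index, pandas aligns them by label rather than by position. A minimal sketch with made-up labels a, b, and c:

```python
import pandas as pd

data = {
    'col1': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
    'col2': pd.Series(['A', 'B'], index=['b', 'c']),  # no entry for 'a'
}

# Rows are the union of both indexes; gaps are filled with NaN
df = pd.DataFrame(data)
print(df)
```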


## 8. Using from_records()

Handy for record-style data such as a list of tuples or a NumPy structured array.

**Example 8:**

In [13]:
data = [
    (1, 'Alice', 85.5),
    (2, 'Bob', 90.0)
]

df = pd.DataFrame.from_records(data, columns=['ID', 'Name', 'Score'])
print(df)

   ID   Name  Score
0   1  Alice   85.5
1   2    Bob   90.0
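
from_records() also accepts a NumPy structured array, in which case the field names supply the column labels. The sketch below assumes the same two records as Example 8 and uses the index argument to promote the ID field to the row index; the dtype widths are arbitrary choices.

```python
import numpy as np
import pandas as pd

# Structured array: each field has a name and a dtype
records = np.array(
    [(1, 'Alice', 85.5), (2, 'Bob', 90.0)],
    dtype=[('ID', 'i4'), ('Name', 'U10'), ('Score', 'f8')],
)

# Field names become columns; 'ID' is used as the index
df = pd.DataFrame.from_records(records, index='ID')
print(df)
```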


## 9. Using from_dict() with orientation

**Example 9:**

In [14]:
data = {
    'row1': [1, 2],
    'row2': [3, 4]
}

df = pd.DataFrame.from_dict(data, orient='index', columns=['A', 'B'])
print(df)

      A  B
row1  1  2
row2  3  4


**Notes:**
  
* orient='index' treats dictionary keys as row labels.
* Useful when the data is keyed by record, such as a dictionary that maps an ID to that record's values.
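
To make the effect of orient concrete, this sketch builds the same dictionary both ways. The default, orient='columns', treats the keys as column labels instead.

```python
import pandas as pd

data = {
    'row1': [1, 2],
    'row2': [3, 4],
}

# Keys become row labels
by_index = pd.DataFrame.from_dict(data, orient='index', columns=['A', 'B'])

# Default orientation: keys become column labels
by_columns = pd.DataFrame.from_dict(data)

# One layout is the transpose of the other
same_values = (by_columns.T.to_numpy() == by_index.to_numpy()).all()
```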

## 10. Empty DataFrame (and Filling Later)

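Good for initializing a structure that you fill row by row with .loc[]. For many rows, collecting them in a list first and building the DataFrame once is faster.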

**Example 10:**

In [15]:
df = pd.DataFrame(columns=['A', 'B', 'C'])
df.loc[0] = [1, 2, 3]
print(df)

   A  B  C
0  1  2  3
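
Adding rows one at a time with .loc[] tends to get slow as the frame grows. A common alternative, sketched here with made-up values, is to collect the rows in a plain Python list and construct the DataFrame once at the end.

```python
import pandas as pd

rows = []
for i in range(3):
    # Each row is a dict; in practice it might come from a loop over files or requests
    rows.append({'A': i, 'B': i * 2, 'C': i * 3})

# Build the frame once, fixing the column order explicitly
df = pd.DataFrame(rows, columns=['A', 'B', 'C'])
print(df)
```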


## When to Use Which?

| Use Case | Method |
| --- | --- |
| Manual entry / small data | dict or list of dicts |
| Numeric arrays (ML input/output) | NumPy array |
| Large files / external data | read_csv, read_excel |
| Building up row by row | Empty + .loc[] (DataFrame.append() was removed in pandas 2.0) |
| From APIs / JSON | list of dicts or from_dict() |