# Creation

## `list`

```python
import pandas as pd
df = pd.DataFrame([1, 2, 3], columns=['colname'])  # data is a list[int]
```

## `dict`

- Keys become column names, values become column data


```python
data = {
    "colname1": [1, 2, 3],        # list[int]
    "colname2": ["a", "b", "c"],  # list[str]
}
df = pd.DataFrame(data)
```

## `list[dict]`

```python
data = [
    # Each dict is one row: colname1 is a str, colname2 an int, colname3 a float
    {"colname1": "a", "colname2": 1, "colname3": 1.0},
    {"colname1": "b", "colname2": 2, "colname3": 2.0},
    {"colname1": "c", "colname2": 3, "colname3": 3.0},
]
df = pd.DataFrame(data)
```

## `dict[dict]`

- Outer dict keys become column names
- Inner dict keys become index values
- Each inner dict specifies a complete column

```python
data = {
    # Each inner dict is one column, keyed by index label
    "colname1": {"index1": "a", "index2": "b", "index3": "c"},
    "colname2": {"index1": 1, "index2": 2, "index3": 3},
    "colname3": {"index1": 1.0, "index2": 2.0, "index3": 3.0},
}
df = pd.DataFrame(data)
```

## `pd.Series`

### Each `pd.Series` as a column

When pandas combines series into a DataFrame:

- It uses the union of all indices as the index
- If a series doesn't have data for a particular index value, that cell will be filled with NaN
- Each series specifies a complete column

```python
data = {
    "colname1": pd.Series([1, 2, 3]),
    "colname2": pd.Series([4, 5, 6]),
}
df = pd.DataFrame(data)
```
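The union-of-indices behaviour described above only becomes visible when the Series do not share the same labels. The following is a minimal sketch with made-up values (`s1`, `s2`, and their labels are illustrative assumptions, not taken from the notes):

```python
import pandas as pd

# Two Series with only partially overlapping index labels (illustrative data).
s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([4, 5], index=["b", "d"])

df = pd.DataFrame({"colname1": s1, "colname2": s2})

# The resulting index is the union of both Series' labels.
print(sorted(df.index))

# Cells with no matching label are NaN; count them per column.
print(df.isna().sum())
```

Because NaN is a floating-point value, columns that receive NaN are typically upcast from integer to float.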

### Each `pd.Series` as a cell value

- To place each Series in the same column but a different row, pass a `list[list[pd.Series]]`:

```python
df = pd.DataFrame(
    [
        [pd.Series([1, 2, 3])],  # row 0
        [pd.Series([4, 5, 6])],  # row 1
    ],
    columns=['colname1']
)
```

- To place each Series in the same row but a different column, pass a `dict[str, list[pd.Series]]`:

```python
df = pd.DataFrame(
    {
        "colname1": [pd.Series([1, 2, 3])],
        "colname2": [pd.Series([4, 5, 6])],
    }
)
```

## `pd.DataFrame`

### Each `pd.DataFrame` as a cell value

- To place each DataFrame in the same column but a different row, pass a `list[list[pd.DataFrame]]`:

```python
df = pd.DataFrame(
    [
        [pd.DataFrame({"a": [1, 2]})],  # row 0
        [pd.DataFrame({"a": [3, 4]})],  # row 1
    ],
    columns=['colname1']
)
```

- To place each DataFrame in the same row but a different column, pass a `dict[str, list[pd.DataFrame]]`:

```python
df = pd.DataFrame(
    {
        "colname1": [pd.DataFrame({"a": [1, 2]})],
        "colname2": [pd.DataFrame({"a": [3, 4]})],
    }
)
```

## Index

```python
df = pd.DataFrame(data, index=['row1', 'row2', 'row3'])
# or
df = pd.DataFrame(data, index=pd.Index(['row1', 'row2', 'row3']))
```

## Index with a Name

```python
df = pd.DataFrame(data, index=pd.Index(['row1', 'row2', 'row3'], name='indexname'))
```
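As a rough illustration of what the index name is for, the sketch below builds a small frame and reads the name back. The sample `data` is a hypothetical placeholder; only `pd.Index(..., name=...)` comes from the snippet above.

```python
import pandas as pd

# Hypothetical sample data, not from the original notes.
data = {"colname1": [10, 20, 30]}
df = pd.DataFrame(
    data,
    index=pd.Index(["row1", "row2", "row3"], name="indexname"),
)

# The name is stored on the index object itself.
print(df.index.name)

# Rows can be looked up by their index label.
print(df.loc["row2", "colname1"])

# reset_index() turns the named index into a regular column of that name.
flat = df.reset_index()
print(list(flat.columns))
```

Naming the index is optional, but it keeps the label meaningful when the index is later moved into a column.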

# Examples

```python
import pandas as pd
```

## Example: Creating DataFrame from a list

```python
data = [1, 2, 3, 4, 5]
df = pd.DataFrame(data, columns=['Numbers'])
df
```

|   | Numbers |
|---|---------|
| 0 | 1 |
| 1 | 2 |
| 2 | 3 |
| 3 | 4 |
| 4 | 5 |


## Example: Creating DataFrame from a dictionary

```python
data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35]
}
df = pd.DataFrame(data)
df
```

|   | Name | Age |
|---|------|-----|
| 0 | Alice | 25 |
| 1 | Bob | 30 |
| 2 | Charlie | 35 |


## Example: Creating DataFrame from a list of dictionaries

```python
data = [
    {"Name": "Alice", "Age": 25, "Salary": 50000.0},
    {"Name": "Bob", "Age": 30, "Salary": 60000.0},
    {"Name": "Charlie", "Age": 35, "Salary": 70000.0},
]
df = pd.DataFrame(data)
df
```

|   | Name | Age | Salary |
|---|------|-----|--------|
| 0 | Alice | 25 | 50000.0 |
| 1 | Bob | 30 | 60000.0 |
| 2 | Charlie | 35 | 70000.0 |


## Example: Creating DataFrame from a dictionary of dictionaries

```python
data = {
    "Name": {"1": "Alice", "2": "Bob", "3": "Charlie"},
    "Age": {"1": 25, "2": 30, "3": 35},
    "Salary": {"1": 50000.0, "2": 60000.0, "3": 70000.0},
}
df = pd.DataFrame(data)
df
```

|   | Name | Age | Salary |
|---|------|-----|--------|
| 1 | Alice | 25 | 50000.0 |
| 2 | Bob | 30 | 60000.0 |
| 3 | Charlie | 35 | 70000.0 |


## Example: Creating DataFrame from Series as columns

```python
series1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
series2 = pd.Series([4, 5, 6], index=["a", "b", "c"])
data = {
    "Column1": series1,
    "Column2": series2
}
df = pd.DataFrame(data)
df
```

|   | Column1 | Column2 |
|---|---------|---------|
| a | 1 | 4 |
| b | 2 | 5 |
| c | 3 | 6 |


## Example: Creating DataFrame with Series as cell values (same column)

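```python
series1 = pd.Series([1, 2, 3])
series2 = pd.Series([4, 5, 6])
df = pd.DataFrame(
    [
        [series1],
        [series2]
    ],
    columns=['Column1']
)

df
```

Each cell shows the text representation of the Series it holds, flattened onto one line:

|   | Column1 |
|---|---------|
| 0 | 0 1 1 2 2 3 dtype: int64 |
| 1 | 0 4 1 5 2 6 dtype: int64 |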


## Example: Creating DataFrame with Series as cell values (same row)

```python
series1 = pd.Series([1, 2, 3])
series2 = pd.Series([4, 5, 6])
df = pd.DataFrame(
    {
        "Column1": [series1],
        "Column2": [series2],
    }
)

df
```

|   | Column1 | Column2 |
|---|---------|---------|
| 0 | 0 1 1 2 2 3 dtype: int64 | 0 4 1 5 2 6 dtype: int64 |


## Example: Creating DataFrame with DataFrames as cell values (same column)

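```python
df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df2 = pd.DataFrame({"A": [5, 6], "B": [7, 8]})
df = pd.DataFrame(
    [
        [df1],
        [df2]
    ],
    columns=['Column1']
)

df
```

Each cell shows the text representation of the DataFrame it holds, flattened onto one line:

|   | Column1 |
|---|---------|
| 0 | A B 0 1 3 1 2 4 |
| 1 | A B 0 5 7 1 6 8 |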


## Example: Creating DataFrame with DataFrames as cell values (same row)

```python
df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df2 = pd.DataFrame({"A": [5, 6], "B": [7, 8]})
df = pd.DataFrame(
    {
        "Column1": [df1],
        "Column2": [df2],
    }
)

df
```

|   | Column1 | Column2 |
|---|---------|---------|
| 0 | A B 0 1 3 1 2 4 | A B 0 5 7 1 6 8 |


## Example: Creating DataFrame with custom index

```python
data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35]
}
df = pd.DataFrame(data, index=['person1', 'person2', 'person3'])
df
```

|   | Name | Age |
|---|------|-----|
| person1 | Alice | 25 |
| person2 | Bob | 30 |
| person3 | Charlie | 35 |


## Example: Creating DataFrame with named index

```python
data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35]
}
df = pd.DataFrame(data, index=pd.Index(['person1', 'person2', 'person3'], name='PersonID'))
df
```

| PersonID | Name | Age |
|----------|------|-----|
| person1 | Alice | 25 |
| person2 | Bob | 30 |
| person3 | Charlie | 35 |
