# General Explanation of .shape
## Definition:
- `.shape` is an attribute (not a method, so it is accessed without parentheses) available on NumPy arrays, Pandas DataFrames, and some other Python data-handling libraries.
- It returns a tuple of integers giving the size of the object along each dimension.

# Usage in Different Contexts

## For NumPy Arrays
- The `.shape` attribute returns a tuple with one value per dimension. For a 2D array:
    - First value: Number of rows (the size of the first dimension).
    - Second value: Number of columns (the size of the second dimension).
- Arrays with more dimensions (3D, 4D, etc.) have one additional value per extra dimension, while a 1D array has a single value, such as `(4,)`.

In [1]:
import numpy as np

# 2D array with 3 rows and 4 columns
array = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [9, 10, 11, 12]])

print(array.shape)  # Output: (3, 4)


(3, 4)
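
The same attribute works for arrays that are not two-dimensional. The sketch below is an illustration only; the names `vector` and `cube` are made up for this example and do not appear elsewhere in the notebook.

```python
import numpy as np

# 1D array: the shape tuple has a single entry (note the trailing comma)
vector = np.array([1, 2, 3, 4])
print(vector.shape)  # (4,)

# 3D array: 2 blocks, each with 3 rows and 4 columns
cube = np.zeros((2, 3, 4))
print(cube.shape)  # (2, 3, 4)

# The length of the shape tuple equals the number of dimensions
print(len(cube.shape) == cube.ndim)  # True
```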


## For Pandas DataFrames
- For a DataFrame, the `.shape` attribute always returns a tuple of two values:
    - First value: Number of rows (records).
    - Second value: Number of columns (features or variables).
- The index and the column labels are not counted.

![](img/shape.png)

In [4]:
import pandas as pd

# DataFrame with 3 rows and 2 columns
data = pd.DataFrame({'A': [1, 2, 3],
                     'B': [4, 5, 6]})
print(data)
print()
print(data.shape)  # Output: (3, 2)


   A  B
0  1  4
1  2  5
2  3  6

(3, 2)
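
Because `.shape` is an ordinary tuple, it can be unpacked or indexed. A minimal sketch, reusing the same small DataFrame; the variable names `n_rows` and `n_cols` are chosen here for illustration.

```python
import pandas as pd

data = pd.DataFrame({'A': [1, 2, 3],
                     'B': [4, 5, 6]})

# Unpack the shape tuple into row and column counts
n_rows, n_cols = data.shape
print(n_rows, n_cols)  # 3 2

# Individual entries can also be read by position
print(data.shape[0])  # 3 (rows)
print(data.shape[1])  # 2 (columns)

# len() of a DataFrame gives the number of rows, same as shape[0]
print(len(data))  # 3
```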


# From a List of Lists

In [6]:
import pandas as pd

# Create a DataFrame from a list of lists
data = [[1, 2], [3, 4], [5, 6]]
df = pd.DataFrame(data)

print(df)
print("Shape:", df.shape)


   0  1
0  1  2
1  3  4
2  5  6
Shape: (3, 2)


# From a List of Tuples

Each tuple becomes one row. The argument `columns=['Number', 'Letter']` is used to **name the columns** of the DataFrame.


In [7]:
# Create a DataFrame from a list of tuples
data = [(1, 'A'), (2, 'B'), (3, 'C'), (4, 'D')]
df = pd.DataFrame(data, columns=['Number', 'Letter'])

print(df)
print("Shape:", df.shape)


   Number Letter
0       1      A
1       2      B
2       3      C
3       4      D
Shape: (4, 2)



## Without columns=['Number', 'Letter']
If the `columns` argument is omitted, Pandas assigns default column labels (integers starting from 0). Passing `columns` lets you give the columns meaningful names; the shape of the DataFrame is the same either way.

In [8]:
# Create a DataFrame from a list of tuples
data = [(1, 'A'), (2, 'B'), (3, 'C'), (4, 'D')]
df = pd.DataFrame(data)

print(df)
print("Shape:", df.shape)

   0  1
0  1  A
1  2  B
2  3  C
3  4  D
Shape: (4, 2)


# From a Single List

In [9]:
# Create a DataFrame from a single list
data = [10, 20, 30, 40]
df = pd.DataFrame(data, columns=['Value'])

print(df)
print("Shape:", df.shape)


   Value
0     10
1     20
2     30
3     40
Shape: (4, 1)
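
For comparison, the same list stored as a Pandas Series is one-dimensional, so its shape has only one value. This is a small sketch for contrast; the names `values`, `series`, and `frame` are illustrative.

```python
import pandas as pd

values = [10, 20, 30, 40]

# A Series is one-dimensional: its shape has a single entry
series = pd.Series(values)
print(series.shape)  # (4,)

# A DataFrame is two-dimensional, even with only one column
frame = pd.DataFrame(values, columns=['Value'])
print(frame.shape)  # (4, 1)
```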


# From a List of Dictionaries

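- Each dictionary becomes a row.
- Columns are determined by the union of the keys across all dictionaries.
- Missing values are filled with **NaN**. Because NaN is a floating-point value, a column of integers that contains NaN is displayed as floats (for example, 25.0).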

In [10]:
# Create a DataFrame from a list of dictionaries
data = [{'Name': 'Alice', 'Age': 25}, 
        {'Name': 'Bob', 'Age': 30, 'City': 'NYC'}, 
        {'Name': 'Charlie', 'City': 'LA'}]
df = pd.DataFrame(data)

print(df)
print("Shape:", df.shape)


      Name   Age City
0    Alice  25.0  NaN
1      Bob  30.0  NYC
2  Charlie   NaN   LA
Shape: (3, 3)
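
Note that `.shape` counts every cell, including those holding NaN. As a rough illustration, the sketch below compares the shape before and after dropping incomplete rows with `dropna()`; the variable name `complete` is chosen for this example.

```python
import pandas as pd

data = [{'Name': 'Alice', 'Age': 25},
        {'Name': 'Bob', 'Age': 30, 'City': 'NYC'},
        {'Name': 'Charlie', 'City': 'LA'}]
df = pd.DataFrame(data)

# NaN cells still count toward the shape
print(df.shape)  # (3, 3)

# Dropping rows with any missing value leaves only Bob's row
complete = df.dropna()
print(complete.shape)  # (1, 3)
```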


# From a 2D List with Index and Column Labels

In [11]:
# Create a DataFrame using a matrix-like structure
index = ['Row1', 'Row2']
columns = ['Col1', 'Col2', 'Col3']
data = [[1, 2, 3], [4, 5, 6]]
df = pd.DataFrame(data, index=index, columns=columns)

print(df)
print("Shape:", df.shape)


      Col1  Col2  Col3
Row1     1     2     3
Row2     4     5     6
Shape: (2, 3)
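
One way to see which value of the tuple refers to rows and which to columns is to transpose the DataFrame, which swaps the two. A minimal sketch using the same data as above:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['Row1', 'Row2'],
                  columns=['Col1', 'Col2', 'Col3'])

print(df.shape)  # (2, 3): 2 rows, 3 columns

# Transposing swaps rows and columns, so the shape is reversed
print(df.T.shape)  # (3, 2)
```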
