## Object, Datetime Datatype
#### Here are some characteristics of the object dtype:
- Heterogeneous Data: It can contain a mix of data types within a single column.
- Flexibility: It can store a wide range of data types, making it versatile for handling diverse data.
- Performance Considerations: Operations on object columns are usually slower than on numeric dtypes, because each value is stored as a separate Python object and pandas cannot rely on fast vectorized routines.
- Memory Usage: It can consume more memory compared to other data types, especially when dealing with large strings or complex objects.
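
The characteristics above can be checked with a small sketch. The series names (`mixed`, `as_int`, `as_obj`) and the sample values are made up for illustration; the exact byte count for the object column depends on the Python build, so only the comparison matters here.

```python
import pandas as pd

# A column mixing an int, a string, a float and None falls back to object dtype
mixed = pd.Series([1, "two", 3.0, None])
print(mixed.dtype)                          # object
element_types = [type(v).__name__ for v in mixed]
print(element_types)                        # each element keeps its own Python type

# Compare memory: the same integers stored as int64 versus as Python objects
as_int = pd.Series(range(1000), dtype="int64")
as_obj = as_int.astype("object")
int_bytes = as_int.memory_usage(index=False, deep=True)
obj_bytes = as_obj.memory_usage(index=False, deep=True)
print(int_bytes, obj_bytes)                 # the object column is noticeably larger
```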

#### Common use cases for the object dtype include:
- Unstructured Data: Storing text data, such as descriptions, comments, or URLs.
- Mixed Data: Handling columns that contain a combination of numeric and non-numeric values.
- Intermediate Calculations: Storing temporary results during data processing before converting to more specific data types.
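
As a rough illustration of the "mixed data" and "intermediate" use cases, an object column can be converted to a specific dtype once it has been cleaned. The column content below is hypothetical; `pd.to_numeric` with `errors="coerce"` turns anything it cannot parse into `NaN`.

```python
import pandas as pd

# Hypothetical raw column: numbers mixed with text, so pandas stores it as object
raw = pd.Series([10, "20", "n/a", 40.5])
print(raw.dtype)        # object

# Convert to a numeric dtype; unparseable entries become NaN
numeric = pd.to_numeric(raw, errors="coerce")
print(numeric.dtype)    # float64
print(numeric.tolist())
```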

In [1]:
import pandas as pd

In [9]:

# Make a DataFrame with integer data
data = {
    'Age':[22, 21, 23, 37, 31, 61, 45, 41, 32, 18],
    'Salary':[50000, 60000, 70000, 80000, 90000, 100000, 100100, 120000, 130000, 140000]
}
df = pd.DataFrame(data)
df

   Age  Salary
0   22   50000
1   21   60000
2   23   70000
3   37   80000
4   31   90000
5   61  100000
6   45  100100
7   41  120000
8   32  130000
9   18  140000


In [10]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Age     10 non-null     int64
 1   Salary  10 non-null     int64
dtypes: int64(2)
memory usage: 292.0 bytes


In [11]:
df.dtypes

Age       int64
Salary    int64
dtype: object

In [12]:
df.columns

Index(['Age', 'Salary'], dtype='object')

In [13]:
# Convert both columns to int32
df["Age"] = df["Age"].astype('int32')
df["Salary"] = df["Salary"].astype('int32')
df

   Age  Salary
0   22   50000
1   21   60000
2   23   70000
3   37   80000
4   31   90000
5   61  100000
6   45  100100
7   41  120000
8   32  130000
9   18  140000


In [14]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Age     10 non-null     int32
 1   Salary  10 non-null     int32
dtypes: int32(2)
memory usage: 212.0 bytes


In [15]:
2**32

4294967296

In [16]:
4294967296 / 2

2147483648.0

In [None]:
# int32 range: -2147483648 to 2147483647 (that is, -2**31 to 2**31 - 1)
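
Instead of working out the limits by hand, NumPy can report them. This is a small sketch using `np.iinfo`; the `ranges` dictionary is just a helper name chosen for this example.

```python
import numpy as np

# Collect the minimum and maximum value of each signed integer dtype
ranges = {}
for dtype in (np.int8, np.int16, np.int32, np.int64):
    info = np.iinfo(dtype)
    ranges[dtype.__name__] = (int(info.min), int(info.max))
    print(f"{dtype.__name__}: {info.min} to {info.max}")
```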

In [17]:
# Convert both columns to int16 (Salary values do not fit and will overflow)
df["Age"] = df["Age"].astype('int16')
df["Salary"] = df["Salary"].astype('int16')
df

   Age  Salary
0   22  -15536
1   21   -5536
2   23    4464
3   37   14464
4   31   24464
5   61  -31072
6   45  -30972
7   41  -11072
8   32   -1072
9   18    8928


In [18]:
2**15

32768

In [None]:
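# int16 range: -32768 to 32767
# Every Salary is above 32767, so it wraps around modulo 65536
# (e.g. 50000 - 65536 = -15536); the Age values all fit and are unchanged.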
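
One way to avoid this kind of silent overflow is to check the value range before casting. The sketch below is only a suggestion: `fits_in` is a hypothetical helper written for this example, and the short `age` and `salary` series reuse a few values from the DataFrame above. `pd.to_numeric(..., downcast="integer")` is a pandas option that picks the smallest integer dtype able to hold every value.

```python
import numpy as np
import pandas as pd

def fits_in(series, dtype):
    """Return True if every value of an integer Series fits in `dtype`."""
    info = np.iinfo(dtype)
    return bool(series.min() >= info.min and series.max() <= info.max)

age = pd.Series([22, 21, 61], dtype="int32")
salary = pd.Series([50000, 60000, 140000], dtype="int32")

print(fits_in(age, np.int16))     # True
print(fits_in(salary, np.int16))  # False

# Let pandas choose the smallest safe integer dtype for each column
age_small = pd.to_numeric(age, downcast="integer")
salary_small = pd.to_numeric(salary, downcast="integer")
print(age_small.dtype, salary_small.dtype)
```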

## The same applies to floating-point data types
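
Floating-point columns can be narrowed in the same way as the integer columns above. The following sketch uses made-up `prices` values; casting to `float32` halves the memory but keeps only about 7 significant decimal digits, so values are rounded.

```python
import numpy as np
import pandas as pd

prices = pd.Series([0.1, 1.5, 123456.789], dtype="float64")
prices32 = prices.astype("float32")

# 8 bytes per value versus 4 bytes per value
bytes64 = prices.memory_usage(index=False)
bytes32 = prices32.memory_usage(index=False)
print(bytes64, bytes32)  # 24 12

# The float32 value is close to, but not exactly, the original
rounded = float(prices32.iloc[2])
print(rounded)

# Largest finite value a float32 can represent
print(np.finfo(np.float32).max)
```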

## Objects & DateTime Datatype

In [33]:
dates = ['2023-10-03', '2023-10-06', '2025-11-01'] 
df1 = pd.DataFrame({'Date': dates})
df1['Date'].dtype

dtype('O')

In [35]:
timestamps = pd.to_datetime(dates)
df2 = pd.DataFrame({'Date2': timestamps})

In [36]:
df2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   Date2   3 non-null      datetime64[ns]
dtypes: datetime64[ns](1)
memory usage: 156.0 bytes
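
Once the strings are converted, date-specific operations become available through the `.dt` accessor. A minimal sketch, reusing the same three dates (the names `as_text` and `as_time` are chosen for this example):

```python
import pandas as pd

dates = ['2023-10-03', '2023-10-06', '2025-11-01']
as_text = pd.Series(dates)         # plain strings
as_time = pd.to_datetime(as_text)  # real datetimes

years = as_time.dt.year.tolist()
day_names = as_time.dt.day_name().tolist()
gap_days = (as_time.iloc[1] - as_time.iloc[0]).days

print(years)      # [2023, 2023, 2025]
print(day_names)  # ['Tuesday', 'Friday', 'Saturday']
print(gap_days)   # 3
```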


In [42]:
# Create a list of date strings in inconsistent formats
dates = ['2023', '2024-05-9', '2025'] 

# Create a DataFrame; the 'Date' column holds plain strings, not datetimes
df1 = pd.DataFrame({'Date': dates})

# timestamps = pd.to_datetime(dates) 
# df2 = pd.DataFrame({'Date2': timestamps})
print(dates)

# Check the data type of the 'Date' column
df1['Date'].dtype


['2023', '2024-05-9', '2025']


dtype('O')
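
The strings stay as `object` because nothing has parsed them yet. How `pd.to_datetime` handles inconsistent formats like the ones above can differ between pandas versions, so the sketch below uses a hypothetical input with one clearly invalid entry and an explicit `format`. With `errors='coerce'`, anything that does not match becomes `NaT` instead of raising an error.

```python
import pandas as pd

# Hypothetical input: two valid dates and one entry that is not a date
raw = pd.Series(['2023-10-03', 'not a date', '2025-11-01'])

parsed = pd.to_datetime(raw, format='%Y-%m-%d', errors='coerce')
print(parsed)
print(parsed.isna().tolist())  # [False, True, False]
```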

In [44]:
df2

       Date2
0 2023-10-03
1 2023-10-06
2 2025-11-01


In [46]:
dates = ['2023-10-04 12:34:56.123456789', '2023-11-15 15:23:45.987654321']
timestamps = pd.to_datetime(dates)
df = pd.DataFrame({'Date': timestamps})
print(df['Date'].dtype)
# datetime64[ns]

datetime64[ns]


In [47]:
df

                           Date
0 2023-10-04 12:34:56.123456789
1 2023-11-15 15:23:45.987654321
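
The `[ns]` in `datetime64[ns]` means the values are stored with nanosecond resolution. As a quick check, the individual components can be read back through the `.dt` accessor (the name `stamps` is chosen for this example):

```python
import pandas as pd

dates = ['2023-10-04 12:34:56.123456789', '2023-11-15 15:23:45.987654321']
stamps = pd.Series(pd.to_datetime(dates))

seconds = stamps.dt.second.tolist()
micros = stamps.dt.microsecond.tolist()
nanos = stamps.dt.nanosecond.tolist()

print(seconds)  # [56, 45]
print(micros)   # [123456, 987654]
print(nanos)    # [789, 321]
```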
