# DTypes


In Pandas, `dtype` stands for "data type," and it refers to the type of data stored in a DataFrame or Series. Understanding `dtype` is crucial for effectively managing and analyzing data, as it determines how Pandas interprets and operates on the data.

### Key Points About `dtype`:

1. **Data Type Identification**:

   - **DataFrame**: You can check the data types of each column in a DataFrame using the `dtypes` attribute.
   - **Series**: Each Series has a single data type, which you can check using the `dtype` attribute.

2. **Common Data Types**:

   - **Integer**: `int64`, `int32`, etc.
   - **Float**: `float64`, `float32`, etc.
   - **Object**: This is a generic data type often used for strings or mixed types.
   - **Datetime**: `datetime64[ns]` for date and time information.
   - **Boolean**: `bool` for True/False values.
   - **Category**: `category` for categorical data with a fixed number of possible values.

3. **Setting Data Types**:

   - **Explicit Conversion**: You can convert the data type of a column using the `astype()` method.
     ```python
     df['column_name'] = df['column_name'].astype('float64')
     ```

4. **Default Data Types**:

   - Pandas often infers data types based on the data provided. For example, numeric columns are typically inferred as `float64` or `int64`, while text columns are inferred as `object`.

5. **Handling Missing Data**:
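   - Missing values appear as `NaN` (Not a Number) in float columns, `NaT` in datetime columns, and `None` or `NaN` in object columns. Because `NaN` is itself a float, an `int64` column that gains a missing value is upcast to `float64`.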
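
The inference and missing-data rules in points 4 and 5 can be checked directly. The following is a minimal sketch on throwaway Series (the variable names are illustrative, not part of this notebook's `df`). The nullable `Int64` dtype shown at the end is an extension dtype that the list above does not cover; it is included only as one possible way to keep integers alongside missing values.

```python
import pandas as pd

# Whole numbers with no gaps are inferred as int64.
ints = pd.Series([1, 2, 3])

# A missing value forces the column up to float64,
# because NaN is itself a floating-point value.
ints_with_gap = pd.Series([1, None, 3])

# Datetime columns mark missing entries with NaT instead of NaN.
dates = pd.Series([pd.Timestamp("2024-01-01"), None])

# The nullable "Int64" extension dtype (capital I) keeps the values
# as integers and stores the gap as pd.NA.
nullable_ints = pd.Series([1, None, 3], dtype="Int64")

print(ints.dtype)                      # int64
print(ints_with_gap.dtype)             # float64
print(ints_with_gap.isna().tolist())   # [False, True, False]
print(dates.isna().tolist())           # [False, True]
print(nullable_ints.dtype)             # Int64
```

`isna()` reports missing entries uniformly across all of these, so it is usually safer than comparing against `None` or `NaN` by hand.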


In [1]:
import pandas as pd

In [2]:
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [1.1, 2.2, 3.3],
    'C': ['Daffa Ilhami', 'Imanima', 'UWU'],
    'D': ['b', 'a', 'c'],
    'E': [pd.Timestamp('2024-01-01'), pd.Timestamp('2024-01-02'), pd.Timestamp('2024-01-03')],
    'F': [True, False, True]
})

df

   A    B             C  D          E      F
0  1  1.1  Daffa Ilhami  b 2024-01-01   True
1  2  2.2       Imanima  a 2024-01-02  False
2  3  3.3           UWU  c 2024-01-03   True


#### Checking DTypes


In [3]:
df.dtypes

A             int64
B           float64
C            object
D            object
E    datetime64[ns]
F              bool
dtype: object
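
Individual columns can be inspected the same way. The sketch below rebuilds a smaller frame so that it runs on its own. `select_dtypes` is a standard pandas method that this notebook does not otherwise use; it is shown here as one convenient way to act on the result of a dtype check.

```python
import pandas as pd

df_small = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [1.1, 2.2, 3.3],
    "F": [True, False, True],
})

# A single column is a Series, so it exposes `dtype` (singular).
print(df_small["A"].dtype)        # int64

# `dtypes` (plural) on the frame is itself a Series indexed by column name.
print(df_small.dtypes["B"])       # float64

# Pick out only the numeric columns; bool columns are not counted as numeric.
numeric_cols = df_small.select_dtypes(include="number").columns.tolist()
print(numeric_cols)               # ['A', 'B']
```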

#### Converting DTypes of a DataFrame


In [None]:
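# Convert every column to a single dtype.
# Left commented out: the string columns C and D cannot be cast to int32,
# so this call would raise an error on this DataFrame.
# df.astype('int32')

# Convert specific columns by passing a {column: dtype} mapping.
# Casting B from float to int32 drops the decimals (1.1 -> 1).
converted = df.astype({'A': 'float32', 'B': 'int32'})

display(converted.dtypes)
display(converted)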

A           float32
B             int32
C            object
D            object
E    datetime64[ns]
F              bool
dtype: object

     A  B             C  D          E      F
0  1.0  1  Daffa Ilhami  b 2024-01-01   True
1  2.0  2       Imanima  a 2024-01-02  False
2  3.0  3           UWU  c 2024-01-03   True
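
Two details of `astype` are easy to miss in the output above: casting a float column to an integer dtype drops the fractional part rather than rounding, and `category` is requested through `astype` like any other dtype. A minimal sketch on standalone Series, with sample values chosen purely for illustration:

```python
import pandas as pd

floats = pd.Series([1.1, 2.9, -3.7])

# astype to an integer dtype truncates toward zero; it does not round.
truncated = floats.astype("int32")
print(truncated.tolist())              # [1, 2, -3]

# Round explicitly first if rounding is the intent.
rounded = floats.round().astype("int32")
print(rounded.tolist())                # [1, 3, -4]

# Repeated labels can be stored as a categorical.
labels = pd.Series(["b", "a", "c", "a"]).astype("category")
print(labels.dtype)                    # category
print(labels.cat.categories.tolist())  # ['a', 'b', 'c']
```

Rounding before the cast is a deliberate extra step here; whether truncation or rounding is correct depends on what the column represents.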
