## [Conversion](https://pandas.pydata.org/pandas-docs/stable/user_guide/missing_data.html#conversion)

In [19]:
import io
import pandas as pd
import numpy as np

In [7]:
data = io.StringIO("a,b\n,True\n2,")
data

<_io.StringIO at 0x7832036c1780>

In [8]:
df  = pd.read_csv(data)
df

     a     b
0  NaN  True
1  2.0   NaN


In [9]:
df.dtypes

a    float64
b     object
dtype: object

In [11]:
df_conv = df.convert_dtypes()
df_conv

      a     b
0  <NA>  True
1     2  <NA>


In [12]:
df_conv.dtypes

a      Int64
b    boolean
dtype: object
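
As a minimal sketch of the same conversion on a single `Series` (the names `raw` and `converted` are illustrative and not part of the notebook above), `convert_dtypes()` moves a float column that holds only whole numbers and `NaN` to the nullable `Int64` dtype, and the missing entry is then represented by `pd.NA`:

```python
import numpy as np
import pandas as pd

# Hypothetical example data: whole numbers stored as float64 because of the NaN.
raw = pd.Series([1.0, np.nan, 3.0])

# convert_dtypes() infers the best nullable dtype for the values it finds.
converted = raw.convert_dtypes()

print(raw.dtype)
print(converted.dtype)
# The missing entry is now the pd.NA sentinel rather than np.nan.
print(converted.iloc[1] is pd.NA)
```

The missing-value mask is unchanged by the conversion; only the dtype and the sentinel differ.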

## [Inserting missing data](https://pandas.pydata.org/pandas-docs/stable/user_guide/missing_data.html#inserting-missing-data)
You can insert missing values by simply assigning to a `Series` or `DataFrame`. The missing value sentinel used will be chosen based on the dtype.

In [17]:
ser = pd.Series([1.,2.,3.])
print(f'ser = \n{ser}')
ser.loc[0] = None
print(f'\nser = \n{ser}')

ser = 
0    1.0
1    2.0
2    3.0
dtype: float64

ser = 
0    NaN
1    2.0
2    3.0
dtype: float64


In [25]:
ser  = pd.Series([pd.Timestamp('2021'),pd.Timestamp('2021')])
print(f'ser = \n{ser}')
ser.iloc[0] = np.nan
print(f'\nser = \n{ser}')

ser = 
0   2021-01-01
1   2021-01-01
dtype: datetime64[ns]

ser = 
0          NaT
1   2021-01-01
dtype: datetime64[ns]


In [27]:
ser = pd.Series([True, False], dtype = 'boolean[pyarrow]')
print(f'ser = \n{ser}')
ser.iloc[0] = None
print(f'\nser = \n{ser}')

ser = 
0     True
1    False
dtype: bool[pyarrow]

ser = 
0     <NA>
1    False
dtype: bool[pyarrow]
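
The same pattern should hold for pandas' own nullable dtypes, which do not require pyarrow. A small sketch, assuming the nullable `Int64` dtype (the name `int_ser` is illustrative): assigning `None` stores `pd.NA` and leaves the dtype unchanged.

```python
import pandas as pd

# Nullable integer Series; the capital "I" in "Int64" selects the
# pandas extension dtype rather than NumPy's int64.
int_ser = pd.Series([1, 2, 3], dtype="Int64")

# Assigning None inserts the dtype's own sentinel, pd.NA.
int_ser.iloc[0] = None

print(int_ser)
print(int_ser.dtype)
```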


##### For object types, pandas will use the value given:

In [29]:
ser = pd.Series(list('abcde'), dtype = 'object')
print(f'ser = \n{ser}')
ser.iloc[1] = None
ser.iloc[2] = np.nan
ser.iloc[3] = pd.NA
print(f'\nser = \n{ser}')

ser = 
0    a
1    b
2    c
3    d
4    e
dtype: object

ser = 
0       a
1    None
2     NaN
3    <NA>
4       e
dtype: object
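
Although an object Series keeps whichever sentinel was assigned, all three are still reported as missing. A short sketch that rebuilds the Series above and checks it with `isna()` (the name `mask` is illustrative):

```python
import numpy as np
import pandas as pd

ser = pd.Series(list("abcde"), dtype="object")
ser.iloc[1] = None
ser.iloc[2] = np.nan
ser.iloc[3] = pd.NA

# isna() treats None, np.nan and pd.NA alike as missing.
mask = ser.isna()
print(mask.tolist())

# dropna() therefore removes all three entries.
print(ser.dropna().tolist())
```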


## [Calculations with missing data](https://pandas.pydata.org/pandas-docs/stable/user_guide/missing_data.html#calculations-with-missing-data)

In [32]:
ser1 = pd.Series([np.nan,np.nan, 3,4])
ser2 = pd.Series([1,np.nan,np.nan, 3])
print(f'ser1 = \n{ser1}')
print(f'\nser2 = \n{ser2}')

ser1 = 
0    NaN
1    NaN
2    3.0
3    4.0
dtype: float64

ser2 = 
0    1.0
1    NaN
2    NaN
3    3.0
dtype: float64


In [33]:
ser1 + ser2

0    NaN
1    NaN
2    NaN
3    7.0
dtype: float64
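
The result above is `NaN` wherever either operand is missing. If a missing value on one side should instead count as zero, one option is `Series.add` with `fill_value`. This is a sketch of an alternative, not something the cells above use, and the name `filled` is illustrative:

```python
import numpy as np
import pandas as pd

ser1 = pd.Series([np.nan, np.nan, 3, 4])
ser2 = pd.Series([1, np.nan, np.nan, 3])

# fill_value replaces a missing value on one side before adding.
# Where both sides are missing, the result stays missing.
filled = ser1.add(ser2, fill_value=0)
print(filled)
```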

When summing data, NA values or empty data will be treated as zero.

In [35]:
pd.Series([np.nan, pd.NA]).sum()


0

In [39]:
pd.Series([], dtype='float64').sum()

np.float64(0.0)

When taking the product, NA values or empty data are treated as 1.

In [40]:
pd.Series([np.nan, pd.NA]).prod()

1

In [41]:
pd.Series([], dtype="float64").prod()

np.float64(1.0)
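
If an all-missing or empty input should yield a missing result instead of 0 or 1, the `min_count` parameter of `sum()` and `prod()` can be used. A minimal sketch, assuming a float Series that contains only `NaN` (the names `all_missing`, `total`, `strict_total` and `strict_prod` are illustrative):

```python
import numpy as np
import pandas as pd

all_missing = pd.Series([np.nan, np.nan])

# Default behaviour: missing values are skipped, so the sum is 0.0.
total = all_missing.sum()

# min_count=1 requires at least one non-missing value;
# otherwise the result is NaN.
strict_total = all_missing.sum(min_count=1)
strict_prod = all_missing.prod(min_count=1)

print(total, strict_total, strict_prod)
```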

Cumulative methods like `cumsum()` and `cumprod()` ignore NA values by default but preserve them in the result. This behavior can be changed with `skipna=False`, in which case every value from the first NA onward becomes NA.

In [46]:
ser = pd.Series([1,np.nan, 3, np.nan, np.nan, 5])
print(f'ser = \n{ser}')
print(f'\nser.cumsum() = \n{ser.cumsum()}')
print(f'\nser.cumsum(skipna=False) = \n{ser.cumsum(skipna=False)}')

ser = 
0    1.0
1    NaN
2    3.0
3    NaN
4    NaN
5    5.0
dtype: float64

ser.cumsum() = 
0    1.0
1    NaN
2    4.0
3    NaN
4    NaN
5    9.0
dtype: float64

ser.cumsum(skipna=False) = 
0    1.0
1    NaN
2    NaN
3    NaN
4    NaN
5    NaN
dtype: float64
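
`cumprod()` follows the same rules. A brief sketch on the same data (the names `running` and `strict` are illustrative):

```python
import numpy as np
import pandas as pd

ser = pd.Series([1, np.nan, 3, np.nan, np.nan, 5])

# Default: NaN positions stay NaN, but the running product continues past them.
running = ser.cumprod()

# skipna=False: every position from the first NaN onward is NaN.
strict = ser.cumprod(skipna=False)

print(running)
print(strict)
```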
