In [None]:
# importing pandas
import pandas as pd

# csv file location
url = 'https://dq-content.s3.amazonaws.com/291/f500.csv'

# making data frame from csv file
data = pd.read_csv(url, index_col='company')

# Handling Missing Data

A dataset may contain missing entries, commonly referred to as `NA`. Depending on the source, they show up as values such as `null`, `None` or `NaN` (Not a Number).

Pandas uses a sentinel value (`None` or `NaN`) to indicate a missing entry.

# **`None`** sentinel value

The first sentinel value used by Pandas is `None`, a Python singleton object that is often used for missing data in Python code.

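A NumPy array that contains `None` is stored with the generic `object` dtype. If you perform aggregations like `sum()` or `min()` across such an array, you will generally get a `TypeError`, because arithmetic between a number and `None` is undefined.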
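A minimal sketch of this failure, assuming a plain NumPy array (the names `values` and `error_message` are illustrative only):

```python
import numpy as np

# An array containing None falls back to the generic 'object' dtype
values = np.array([1, None, 3, 4])
print(values.dtype)

# Summing hits the None entry: int + NoneType is not defined
error_message = None
try:
    values.sum()
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```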

# **`NaN`** missing numerical data

The other sentinel value used by Pandas is `NaN`, which is a special floating-point value recognized by all systems that use the standard IEEE floating-point representation.

Regardless of the operation, the result of arithmetic with `NaN` will be another `NaN`.

NumPy provides `NaN`-safe versions of its aggregation functions, such as `np.nansum()`, `np.nanmin()` and `np.nanmax()`, which ignore these missing values.
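The following sketch contrasts a regular aggregation with its `NaN`-safe counterpart. The array contents are made up for illustration:

```python
import numpy as np

# NaN is a float, so the array keeps a native floating-point dtype
values = np.array([1, np.nan, 3, 4])
print(values.dtype)

# Arithmetic and regular aggregations propagate NaN
plain_sum = values.sum()
print(1 + np.nan, plain_sum)

# NaN-safe aggregations skip the missing entry
safe_sum = np.nansum(values)
safe_min = np.nanmin(values)
safe_max = np.nanmax(values)
print(safe_sum, safe_min, safe_max)
```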

# **`NaN`** and **`None`** in Pandas

`NaN` and `None` both have their place, and Pandas is built to handle the two of them nearly interchangeably, converting between them where appropriate.
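As a small illustration of this conversion, assuming a numeric Series built from a list that mixes both sentinels (the name `s` is illustrative):

```python
import numpy as np
import pandas as pd

# None is converted to NaN and the Series is stored as float64
s = pd.Series([1, np.nan, 2, None])
print(s.dtype)

# Both entries are reported as missing
null_flags = s.isnull().tolist()
print(null_flags)

# Unlike plain NumPy, Pandas aggregations skip NaN by default
total = s.sum()
print(total)
```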

# Operating on **`null`** Values

Pandas treats `None` and `NaN` as essentially interchangeable for indicating missing or `null` values.

To facilitate this convention, there are several useful methods for detecting, removing, and replacing `null` values in Pandas data structures.

They are:

* `isnull()` : generates a Boolean mask indicating missing values.
* `notnull()` : opposite of `isnull()`.
* `dropna()` : returns a filtered version of the data with missing values removed.
* `fillna()` : returns a copy of the data with missing values filled or imputed.

**Example**

In [None]:
# data filter
data_filter = data['profits'].isnull()

# data selection
data_selection = data.loc[data_filter, 'profits']

data_selection

| company | profits |
|---|---|
| Heraeus Holding | NaN |
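The same filter can be tried without downloading the dataset. This is a sketch on a toy frame: the company names and profit figures are hypothetical and do not come from `f500.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the f500 dataset
toy = pd.DataFrame(
    {"profits": [120.5, np.nan, 87.0]},
    index=pd.Index(["Alpha", "Beta", "Gamma"], name="company"),
)

# isnull() and notnull() produce complementary Boolean masks
missing_mask = toy["profits"].isnull()
present_mask = toy["profits"].notnull()

# Select the rows with a missing profit in a single .loc step
missing_profits = toy.loc[missing_mask, "profits"]
print(missing_profits)

# Summing a Boolean mask counts the missing entries
missing_count = int(missing_mask.sum())
print(missing_count)
```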


## Dropping **`null`** values

The method `dropna()` is used to remove `NA` values. It returns a filtered version of the data.

By default, it will drop all rows in which any null value is present.

Alternatively, you can drop `NA` values along a different axis. For example, `axis=1` (equivalently, `axis='columns'`) will drop all columns containing a `null` value:

> *Syntax*
```python
dropna(axis='columns')
```
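A short sketch of both behaviours on a made-up frame (the column names `a`, `b`, `c` are illustrative). The `how='all'` call is an extra shown for comparison: it only drops columns in which every value is missing.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [4.0, 5.0, 6.0],
    "c": [np.nan, np.nan, 9.0],
})

# Default: drop every row that contains at least one null value
rows_dropped = df.dropna()
print(rows_dropped)

# axis='columns': drop every column that contains at least one null value
cols_dropped = df.dropna(axis="columns")
print(cols_dropped)

# how='all': drop a column only if all of its values are null
all_null_dropped = df.dropna(axis="columns", how="all")
print(all_null_dropped)
```

In each case `dropna()` returns a new object, so `df` itself keeps its original shape.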

## Filling **`null`** values

The method `fillna()` is used to fill in `NA` values. It returns a copy of the data with the `null` values replaced by a specified value. For example, `data.fillna(0)` will fill in `NA` values with `0`. The original `data` is left unchanged unless you assign the result back.
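A minimal sketch on a made-up Series. Filling with the mean is one possible imputation choice shown for illustration, not a recommendation for this dataset:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0, np.nan], index=list("abcd"))

# Replace every missing value with a constant
filled_zero = s.fillna(0)
print(filled_zero)

# Replace missing values with the mean of the observed values
filled_mean = s.fillna(s.mean())
print(filled_mean)

# fillna() returned copies, so the original still has its missing entries
remaining_nulls = int(s.isnull().sum())
print(remaining_nulls)
```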

