Currently, IPv4 and IPv6 addresses are supported as valid input.

IP addresses can be converted to any of the following output formats:
* `compressed`: provides the compressed version of the IP address,
* `full`: provides the full version of the IP address,
* `binary`: provides the binary representation of the IP address,
* `hexa`: provides the hexadecimal representation of the IP address,
* `integer`: provides the integer representation of the IP address,
* `packed`: provides the packed binary representation of the IP address.

The default output format is `compressed`.
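
To make the six formats concrete, the sketch below renders one address in each of them using Python's standard `ipaddress` module. It is an illustration only: the helper `render_formats` is hypothetical, and the mapping from format names to `ipaddress` attributes is an assumption, so the exact strings produced by `clean_ip` may differ.

```python
import ipaddress


def render_formats(address: str) -> dict:
    """Hypothetical helper: one plausible rendering of each output format."""
    ip = ipaddress.ip_address(address)
    width = ip.max_prefixlen  # 32 bits for IPv4, 128 bits for IPv6
    return {
        "compressed": ip.compressed,  # shortest textual form
        "full": ip.exploded,  # every group written out in full
        "binary": format(int(ip), f"0{width}b"),  # zero-padded bit string
        "hexa": hex(int(ip)),
        "integer": int(ip),
        "packed": ip.packed,  # raw bytes in network order
    }


v4_formats = render_formats("192.168.1.1")
v6_formats = render_formats("2001:db8::1")

for name, value in v4_formats.items():
    print(f"{name:>10}: {value!r}")
```

For IPv4 the `compressed` and `full` forms coincide; the difference only shows up for IPv6 addresses that contain runs of zeros.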

Values that cannot be parsed are handled according to the `errors` parameter:

* "coerce" (default): values that cannot be parsed are set to NaN
* "ignore": values that cannot be parsed are returned unchanged
* "raise": values that cannot be parsed raise an exception
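
The three policies above can be sketched for a single value as follows. This is a minimal illustration built on the standard `ipaddress` module, not the code `clean_ip` runs; the helper name `parse_ip` is hypothetical, and the standard library may accept or reject some inputs differently from `clean_ip`.

```python
import ipaddress
import math


def parse_ip(value, errors="coerce"):
    """Hypothetical helper mimicking the three `errors` policies."""
    if errors not in ("coerce", "ignore", "raise"):
        raise ValueError(f"unknown errors policy: {errors!r}")
    try:
        return ipaddress.ip_address(value).compressed
    except ValueError:
        if errors == "raise":
            raise  # propagate the parsing error to the caller
        if errors == "ignore":
            return value  # hand the input back unchanged
        return float("nan")  # "coerce": replace the value with NaN


coerced = parse_ip("455.0.0.0")  # octet out of range, so NaN
ignored = parse_ip("455.0.0.0", errors="ignore")
cleaned = parse_ip("2001:0db8:0000:0000:0000:0000:0000:0001")

print(coerced, ignored, cleaned)
```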

After cleaning, a **report** is printed that provides the following information:

* How many values were cleaned (a value counts as cleaned only if it was transformed).
* How many values could not be parsed.
* A summary of the cleaned data: how many values are in the correct format, and how many values are NaN.

### An example dataset containing IP addresses

In [None]:
import pandas as pd
df = pd.DataFrame({
    "ips": [
        "00.000.0.0", "455.0.0.0", None, 876234, {}, "00.12.021.255",
        "684D:1111:222:3333:4444:5555:6:77", b'\xc9\xdb\x10\x00'
    ]
})
df

## 1. Default `clean_ip`

By default, `clean_ip` cleans both IPv4 and IPv6 addresses and outputs them in the `compressed` format.

In [None]:
from dataprep.clean import clean_ip
clean_ip(df, "ips")

## 2. Input formats

This section demonstrates the `input_format` parameter.

### `ipv4`

Will parse only IPv4 addresses.

In [None]:
clean_ip(df, "ips", input_format="ipv4")

### `ipv6`

Will parse only IPv6 addresses.

In [None]:
clean_ip(df, "ips", input_format="ipv6")

### `auto` (default)

Will parse both IPv4 and IPv6 addresses.

In [None]:
clean_ip(df, "ips", input_format="auto")
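
The effect of the three `input_format` values can be approximated with the standard `ipaddress` module, as in the sketch below. The lookup table `PARSERS` and the function `matches_input_format` are hypothetical names introduced for illustration, and the standard-library parsers are assumed stand-ins for whatever `clean_ip` uses internally.

```python
import ipaddress

# Hypothetical lookup: which parsers each input format is allowed to try.
PARSERS = {
    "ipv4": (ipaddress.IPv4Address,),
    "ipv6": (ipaddress.IPv6Address,),
    "auto": (ipaddress.IPv4Address, ipaddress.IPv6Address),
}


def matches_input_format(value: str, input_format: str = "auto") -> bool:
    """Return True if `value` parses under the chosen input format."""
    for parser in PARSERS[input_format]:
        try:
            parser(value)
            return True
        except ValueError:  # AddressValueError is a subclass of ValueError
            continue
    return False


samples = ["10.0.0.1", "684D:1111:222:3333:4444:5555:6:77"]
for sample in samples:
    results = {fmt: matches_input_format(sample, fmt) for fmt in PARSERS}
    print(sample, results)
```

Restricting the format is useful when a column is expected to hold only one address family, so that addresses of the other family are treated as invalid.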

## 3. Output formats

### `compressed` (default)

In [None]:
clean_ip(df, "ips", output_format="compressed")

### `full`

In [None]:
clean_ip(df, "ips", output_format="full")

### `binary`

In [None]:
clean_ip(df, "ips", output_format="binary")

### `hexa`

In [None]:
clean_ip(df, "ips", output_format="hexa")

### `integer`

In [None]:
clean_ip(df, "ips", output_format="integer")

### `packed`

In [None]:
clean_ip(df, "ips", output_format="packed")

## 4. `errors` parameter

### `coerce` (default)

In [None]:
clean_ip(df, "ips", errors="coerce")

### `ignore`

In [None]:
clean_ip(df, "ips", errors="ignore")

## 5. `validate_ip()`

`validate_ip()` returns `True` if the input is a valid IP address and `False` otherwise. It can be called on a single value or on a Series such as `df["ips"]`.
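
For intuition, a comparable check can be written with the standard `ipaddress` module. The function `is_valid_ip` below is a hypothetical stand-in, not `validate_ip()` itself, and it assumes that only strings are candidate addresses; the real function may treat integers, bytes, or zero-padded strings differently.

```python
import ipaddress


def is_valid_ip(value) -> bool:
    """Hypothetical check: True only for strings that parse as IPv4 or IPv6."""
    if not isinstance(value, str):
        return False  # assumption: non-string inputs are never valid
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


candidates = ["455.0.0.0", {}, " ", "0.0.0.0", "684D:1111:222:3333:4444:5555:6:77"]
flags = [is_valid_ip(candidate) for candidate in candidates]
print(flags)
```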

In [None]:
from dataprep.clean import validate_ip

print(validate_ip("455.0.0.0"))
print(validate_ip({}))
print(validate_ip(" "))
print(validate_ip("0.0.0.0"))
print(validate_ip("684D:1111:222:3333:4444:5555:6:77"))

In [None]:
df_2 = validate_ip(df["ips"])
df_2