# Selection: Regex

No Python data wrangling framework would be complete without a nod to regular expressions.

The full syntax of regex is beyond the scope of this tutorial (but you can learn more [here](https://regexone.com/) or via many online sources).

This page gives a few simple examples of how to use regex to select datachef cells.
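For readers new to regex, the handful of constructs used on this page can be tried out with Python's standard `re` module alone. This is a minimal sketch that does not involve datachef, and the sample strings are invented for illustration:

```python
import re

# "." matches any single character and "*" repeats the previous item
# zero or more times, so "M.*" reads as "an M followed by anything".
starts_with_m = re.fullmatch("M.*", "Manufacturing") is not None

# Regex is case sensitive by default, so a lowercase "m" does not match.
lowercase_m = re.fullmatch("M.*", "manufacturing") is not None

# "[0-9]" matches exactly one digit; four in a row describe a year.
four_digits = re.fullmatch("[0-9][0-9][0-9][0-9]", "2021") is not None

# "{4}" is an equivalent, shorter way to say "repeat four times".
four_digits_short = re.fullmatch("[0-9]{4}", "2021") is not None
two_digits_only = re.fullmatch("[0-9]{4}", "21") is not None

print(starts_with_m, lowercase_m, four_digits, four_digits_short, two_digits_only)
```

`re.fullmatch` is used here so that each pattern has to describe the whole string, which keeps the behaviour of each construct easy to see.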

## Source Data

The data source we're using for these examples is shown below:

| <span style="color:green">Note - this particular table has some very verbose headers we don't care about, so we'll be using `bounded=` to remove them from the previews as well as to show just the subset of data we're working with.</span>|
|-----------------------------------------|

The [full data source can be downloaded here](https://github.com/mikeAdamss/datachef/raw/main/tests/fixtures/xlsx/ons-oic.xlsx). We'll be using the 10th tab, named "Table 3c".

```python
from datachef import acquire, preview
from datachef.selection import XlsxSelectable

table: XlsxSelectable = acquire.xlsx.http("https://github.com/mikeAdamss/datachef/raw/main/tests/fixtures/xlsx/ons-oic.xlsx", tables="Table 3c")
preview(table, bounded="A4:H10")
```

## Simple Regex Examples

The following are simple examples of how to use regex with datachef selections.

Note: for brevity, the selection method is named `re`, the common shorthand for regex.

```python
from datachef import acquire, preview
from datachef.selection import XlsxSelectable

table: XlsxSelectable = acquire.xlsx.http("https://github.com/mikeAdamss/datachef/raw/main/tests/fixtures/xlsx/ons-oic.xlsx", tables="Table 3c")

# cells beginning with a capital M
m_cells = table.re("M.*").label_as("Cells starting with a capital M")

# cells containing the word "housing"
housing = table.re(".*housing.*").label_as("Cells containing the word housing")

# cells containing the word "work"
work = table.re(".*work.*").label_as("Cells containing the word work")

# cells ending in a year
year = table.re(".*[0-9][0-9][0-9][0-9]").label_as("Cells ending in a year")

preview(m_cells, housing, work, year, bounded="A4:H10")
```
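To get a feel for what these four patterns accept, they can be checked against a few strings with the standard `re` module. The sketch below makes two assumptions that are not confirmed by this page: the sample values are hypothetical (they are not taken from the ONS table), and the pattern is assumed to be tested from the start of each cell value, as `re.match` does. datachef may apply patterns differently.

```python
import re

# Hypothetical cell values, invented for illustration only.
samples = [
    "Manufacturing",
    "New housing",
    "Infrastructure work",
    "Dec 2021",
    "2021 (revised)",
]

# The same four patterns used in the datachef example above.
patterns = {
    "starts with M": "M.*",
    "contains housing": ".*housing.*",
    "contains work": ".*work.*",
    "ends in a year": ".*[0-9][0-9][0-9][0-9]",
}

# Assumption: matching is anchored at the start of the value only (re.match).
matches = {
    label: [s for s in samples if re.match(pattern, s)]
    for label, pattern in patterns.items()
}

# Adding "$" also anchors the pattern at the end of the value.
anchored_year = [s for s in samples if re.match(".*[0-9]{4}$", s)]

for label, found in matches.items():
    print(label, "->", found)
print("ends in a year (with $) ->", anchored_year)
```

Under that assumption, the unanchored year pattern also accepts a value such as "2021 (revised)", where the four digits are followed by other text. If strictly "ends in a year" is what you need, a trailing `$` states that intent explicitly and behaves the same whether or not the library already requires the whole value to match.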