# Selection: Regex

No Python parsing library would be complete without a nod to regular expressions.

The full syntax of regex is beyond the scope of this tutorial, but you can learn more at [RegexOne](https://regexone.com/) or from many other online sources.

This page gives a few simple examples of how to use regex with datachef cells.

## Source Data

The data source we're using for these examples is shown below:

| <span style="color:green">Note - this particular table has some very verbose headers we don't care about, so we'll be using `bounded=` to remove them from the previews as well as to show just the subset of data we're working with.</span>|
|-----------------------------------------|

The [full data source can be downloaded here](https://github.com/mikeAdamss/datachef/raw/main/tests/fixtures/xlsx/ons-oic.xlsx). We'll be using the 10th tab, named "Table 3c".

In [1]:
from typing import List
from datachef import acquire, preview
from datachef.selection import XlsxSelectable

tables: List[XlsxSelectable] = acquire.xlsx.http("https://github.com/mikeAdamss/datachef/raw/main/tests/fixtures/xlsx/ons-oic.xlsx")
preview(tables[9], bounded="A4:H10")

0,1,2,3,4,5,6,7,8
,A,B,C,D,E,F,G,H
4.0,Percentage change 3 months on previous 3 months,,,,,,,
5.0,Time period,Public new housing,Private new housing,Total new housing,Infrastructure new work,Public other new work,Private industrial new work,Private commercial new work
6.0,Dataset identifier code,MVO6,MVO7,MVO8,MVO9,MVP2,MVP3,MVP4
7.0,Jun 2010,5.6,9.8,8.8,3,4.3,3.7,1.9
8.0,Jul 2010,2,5.6,4.8,0.2,-0.2,9.7,3.5
9.0,Aug 2010,5.5,4.5,4.7,-2.9,-2.9,24.4,5.9
10.0,Sep 2010,11.7,7.5,8.5,-6.8,-3.3,16.1,5.3


## Simple Regex Examples

The following are simple examples of how to use regex with datachef selections.

Note: for brevity, the selection method is named `re`, the common shorthand for regex.

In [9]:
from typing import List
from datachef import acquire, preview
from datachef.selection import XlsxSelectable

tables: List[XlsxSelectable] = acquire.xlsx.http("https://github.com/mikeAdamss/datachef/raw/main/tests/fixtures/xlsx/ons-oic.xlsx")
table = tables[9]

# cells beginning with a capital M
m_cells = table.re("M.*").label_as("Cells starting with a capital M")

# cells containing the word "housing"
housing = table.re(".*housing.*").label_as("Cells containing the word housing")

# cells containing the word "work"
work = table.re(".*work.*").label_as("Cells containing the word work")

# cells ending in a year
year = table.re(".*[0-9][0-9][0-9][0-9]").label_as("Cells ending in a year")

preview(m_cells, housing, work, year, bounded="A4:H10")

0
Cells starting with a capital M
Cells containing the word housing
Cells containing the word work
Cells ending in a year

0,1,2,3,4,5,6,7,8
,A,B,C,D,E,F,G,H
4.0,Percentage change 3 months on previous 3 months,,,,,,,
5.0,Time period,Public new housing,Private new housing,Total new housing,Infrastructure new work,Public other new work,Private industrial new work,Private commercial new work
6.0,Dataset identifier code,MVO6,MVO7,MVO8,MVO9,MVP2,MVP3,MVP4
7.0,Jun 2010,5.6,9.8,8.8,3,4.3,3.7,1.9
8.0,Jul 2010,2,5.6,4.8,0.2,-0.2,9.7,3.5
9.0,Aug 2010,5.5,4.5,4.7,-2.9,-2.9,24.4,5.9
10.0,Sep 2010,11.7,7.5,8.5,-6.8,-3.3,16.1,5.3
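
It can help to try a pattern on a few plain strings before running it against a whole table. The sketch below does that with Python's standard-library `re` module rather than datachef, using values copied from rows 5 to 7 of the preview above. It assumes the whole cell value must match the pattern (`fullmatch`). Whether datachef's `re()` anchors patterns in the same way is an assumption here, not something this page confirms. The names `cell_values`, `patterns` and `matching_values` are illustrative only.

```python
import re

# Sample cell values copied from the preview above (not read from the spreadsheet).
cell_values = [
    "Time period",
    "Public new housing",
    "Infrastructure new work",
    "Dataset identifier code",
    "MVO6",
    "MVP2",
    "Jun 2010",
    "5.6",
]

# The same four patterns used in the datachef example.
patterns = {
    "starts with M": r"M.*",
    "contains housing": r".*housing.*",
    "contains work": r".*work.*",
    "ends in a year": r".*[0-9][0-9][0-9][0-9]",
}


def matching_values(pattern, values):
    """Return the values that the pattern matches in full.

    Assumption: full-string matching, which may differ from datachef's behaviour.
    """
    compiled = re.compile(pattern)
    return [value for value in values if compiled.fullmatch(value)]


results = {name: matching_values(p, cell_values) for name, p in patterns.items()}

for name, matched in results.items():
    print(f"{name}: {matched}")
```

With these sample values, each pattern picks out only the cells described by its comment in the datachef example. The year pattern could also be written more compactly as `.*[0-9]{4}`, which means the same thing.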
