# Regular expressions for SQLite, implemented in Python

> The REGEXP operator is a special syntax for the regexp() user function. … the "X REGEXP Y" operator will be implemented as a call to "regexp(Y,X)".

Source: [SQL As Understood By SQLite: The LIKE, GLOB, REGEXP, and MATCH operators](http://www.sqlite.org/lang_expr.html#like)

> SQLite does not contain regular expression functionality by default.
> 
> It defines a `REGEXP` operator, but this will fail with an error message unless you or your framework [define a user function](http://www.sqlite.org/c3ref/create_function.html) called `regexp()`. How you do this will depend on your platform.

Source: [How do I use regex in a SQLite query?](https://stackoverflow.com/questions/5071601/) 
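The argument order described in the first quote is easy to get backwards, so here is a minimal sketch that makes it visible. It assumes an in-memory database and a hypothetical helper, `recording_regexp`, which only records the arguments it receives and is not a real matcher.

```python
import sqlite3

calls = []  # every (first_arg, second_arg) pair SQLite passes to the function


def recording_regexp(pattern, value):
    # Hypothetical helper for illustration: record the arguments, match everything.
    calls.append((pattern, value))
    return True


demo_conn = sqlite3.connect(":memory:")
demo_conn.create_function("REGEXP", 2, recording_regexp)

# The value is written on the left of REGEXP and the pattern on the right ...
demo_conn.execute("SELECT 'value-on-the-left' REGEXP 'pattern-on-the-right'").fetchone()

# ... but the user function receives the pattern first and the value second.
print(calls)
demo_conn.close()
```

This is why the Python function registered for `REGEXP` should take the pattern as its first parameter and the column value as its second.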

## Code adapted from these sources

1. [sqlite3 : create function regexp with python](https://stackoverflow.com/questions/50063058/) (I could not reproduce the output shown there with the code provided in the example)

2. [Problem with regexp python and sqlite](https://stackoverflow.com/questions/5365451)

In [1]:
import sqlite3
import re # Regular expression operations

In [2]:
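def regexp(expr, item):
    # SQLite evaluates "item REGEXP expr" as regexp(expr, item).
    if item is None:  # a NULL value has nothing to search
        return False
    reg = re.compile(expr)
    return reg.search(item) is not None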

In [3]:
conn = sqlite3.connect('data/sample.db')
cursor = conn.cursor()

In [4]:
conn.create_function("REGEXP", 2, regexp)
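Because `data/sample.db` is not included here, the registration step can be tried on its own with a throwaway database. The following is a minimal sketch under stated assumptions: the table `demo_colors` and its four rows are made up for illustration, and the `None` check is an addition that keeps a NULL value from raising an error inside the user function.

```python
import re
import sqlite3


def regexp(expr, item):
    """Return True when the pattern `expr` is found anywhere in `item`."""
    if item is None:  # NULL column value: treat as "no match"
        return False
    return re.search(expr, item) is not None


demo_conn = sqlite3.connect(":memory:")
# Name, number of arguments, Python callable.
demo_conn.create_function("REGEXP", 2, regexp)

# Hypothetical table and rows, for illustration only.
demo_conn.execute("CREATE TABLE demo_colors (colorname TEXT)")
demo_conn.executemany(
    "INSERT INTO demo_colors VALUES (?)",
    [("teal",), ("#38ACEC",), ("1234",), (None,)],
)

rows = demo_conn.execute(
    "SELECT colorname FROM demo_colors WHERE colorname REGEXP ?", ["[a-z]"]
).fetchall()
print(rows)
demo_conn.close()
```

Only `teal` contains a lowercase ASCII letter, so it is the only row returned. The function is registered per connection, so every new connection needs its own `create_function` call.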

In [5]:
cursor.execute('SELECT colorname FROM answers WHERE colorname REGEXP ?', ['[a-z]'])

<sqlite3.Cursor at 0x7f139c7d0960>

In [6]:
data = cursor.fetchall()

In [7]:
print(data)

[('aubergine',), ('baby blue',), ('black',), ('blue',), ('blue',), ('blue',), ('blue',), ('blue',), ('blue',), ('bright green',), ('brown',), ('carrot',), ('cornflower blue',), ('cyan',), ('dark green',), ('dark iron',), ('dark orange',), ('dark tan',), ('faint violet',), ('goldenrod',), ('gray',), ('green',), ('green',), ('green',), ('light blue',), ('light chocolate',), ('light purple',), ('lime green',), ('magenta',), ('moss green',), ('mud',), ('mustard',), ('navy',), ('navy blue',), ('olive green',), ('orange',), ('pale green',), ('pale rose',), ('pastel blue',), ('pink',), ('purple',), ('purple',), ('purple',), ('purple',), ('purplish blue',), ('red',), ('red brown',), ('teal',), ('wedgwood blue',)]


In [8]:
len(data)

49

In [9]:
cursor.execute('SELECT colorname FROM names WHERE colorname REGEXP ?', ['[a-z]'])
data = cursor.fetchall()
print(data)

[("                                                                                                      ok, now you're screwing with me",), ("                                                                               kasclaknvlzkxmnv;dfgojxx;kjvbx.vbzvlmzcnvl;dhgahxc ,,cb/lfkgjas;hgihsfblzxcb,sngas'dgpsidjga;dflkasfgjafg'afdga;dfgadfglkdjfg;dlfkjgafdg;lkadfgja'fg;akdfjgadfl'gdfakgjdfl'gkjdfg/m,nbv.vbmnfb'dfgkldfhdgf';fg'adfg;kjf';lgfkgs",), ('                                 (o")',), ('                                ("o)',), ('                               ("o)',), ('                               (o")',), ('                         >>==n;)',), ('                        (o")',), ('                      ("o)',), ('                     >>===="o)',), ('                   (o")',), ('                  >>====>("o)',), ('                (o")',), ('               >>====>   ("o)',), ('               yellow',), ('             ("o)',), ('          (o")',), ('         (o")',), ('       ("o)

In [10]:
len(data)

44

In [11]:
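# trim() with one argument removes leading and trailing spaces
# see https://www.techonthenet.com/sqlite/functions/trim.php
# Note: Python's re module has no POSIX classes, so '[:alpha:]' is read as
# the character set {':', 'a', 'l', 'p', 'h'}, not as "any letter".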
cursor.execute('SELECT trim(colorname) AS trimmed_colorname FROM names WHERE trimmed_colorname REGEXP ?', ['[:alpha:]'])
data = cursor.fetchall()
print(data)

[("ok, now you're screwing with me",), ("kasclaknvlzkxmnv;dfgojxx;kjvbx.vbzvlmzcnvl;dhgahxc ,,cb/lfkgjas;hgihsfblzxcb,sngas'dgpsidjga;dflkasfgjafg'afdga;dfgadfglkdjfg;dlfkjgafdg;lkadfgja'fg;akdfjgadfl'gdfakgjdfl'gkjdfg/m,nbv.vbmnfb'dfgkldfhdgf';fg'adfg;kjf';lgfkgs",), ('yellow',), ('what would you call this color?',), ('cornflowerblue',), ('h',), ('orange',), ('#38acec',), ('#dd00ff (yes i looked it up)',), ('68f79oibjl',), ('a',), ('a cross between terra cotta and light pink',), ('a little bit light blue',), ('a mix of blue and purple',)]


In [12]:
len(data)

14
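The 14 rows above are consistent with how Python's `re` module reads `[:alpha:]`: it does not support POSIX character classes, so the pattern is an ordinary set containing the characters `:`, `a`, `l`, `p` and `h`. The sketch below shows the difference on a few sample strings, some taken from the outputs above. The replacement pattern `[A-Za-z]` is an assumption about the intent (any ASCII letter), not something the original query states.

```python
import re

posix_style = re.compile("[:alpha:]")  # the pattern used in the query above
ascii_letters = re.compile("[A-Za-z]")  # hypothetical replacement: any ASCII letter

samples = ["h", ":", "68f79oibjl", "moss", '(o")']

# Map each sample to (matches posix_style, matches ascii_letters).
results = {
    s: (posix_style.search(s) is not None, ascii_letters.search(s) is not None)
    for s in samples
}

for sample, (as_posix, as_letters) in results.items():
    print(f"{sample!r:14} [:alpha:] -> {as_posix!s:5}  [A-Za-z] -> {as_letters}")
```

A string such as `moss` contains letters but none of `:`, `a`, `l`, `p`, `h`, so `[:alpha:]` rejects it, while a lone `:` is accepted.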

In [13]:
cursor.execute('SELECT trim(colorname) AS trim_name FROM names WHERE trim_name REGEXP ?', ['^[a-z].*[a-z]$'])
data = cursor.fetchall()
print(data)

[("ok, now you're screwing with me",), ("kasclaknvlzkxmnv;dfgojxx;kjvbx.vbzvlmzcnvl;dhgahxc ,,cb/lfkgjas;hgihsfblzxcb,sngas'dgpsidjga;dflkasfgjafg'afdga;dfgadfglkdjfg;dlfkjgafdg;lkadfgja'fg;akdfjgadfl'gdfakgjdfl'gkjdfg/m,nbv.vbmnfb'dfgkldfhdgf';fg'adfg;kjf';lgfkgs",), ('yellow',), ('bmb',), ('cornflowerblue',), ('n mn',), ('orange',), ('turquoise',), ('a cross between terra cotta and light pink',), ('a little bit light blue',), ('a mix of blue and purple',)]


In [14]:
len(data)

11
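The pattern `^[a-z].*[a-z]$` needs one lowercase letter at the start and another at the end, so it can only match strings of at least two characters. That is why single-letter names such as `a` and `h`, which appeared in the previous result, are missing here. A minimal sketch, assuming a hypothetical relaxed variant that also accepts a single letter:

```python
import re

strict = re.compile(r"^[a-z].*[a-z]$")  # pattern from the query above
relaxed = re.compile(r"^[a-z](?:.*[a-z])?$")  # hypothetical variant: tail is optional

samples = ["yellow", "a", "n mn", "#38acec"]

# Map each sample to (matches strict, matches relaxed).
results = {
    s: (strict.search(s) is not None, relaxed.search(s) is not None)
    for s in samples
}

for sample, (is_strict, is_relaxed) in results.items():
    print(f"{sample!r:10} strict -> {is_strict!s:5}  relaxed -> {is_relaxed}")
```

Both patterns reject `#38acec` because it does not start with a lowercase letter; only the relaxed one accepts `a`.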