##### Python for High School (Summer 2022)

* [Table of Contents](PY4HS.ipynb)
* <a href="https://colab.research.google.com/github/4dsolutions/elite_school/blob/master/Py4HS_August_16_2022.ipynb"><img align="left" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab" title="Open and Execute in Google Colaboratory"></a>
* [![nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/4dsolutions/elite_school/blob/master/Py4HS_August_16_2022.ipynb)

### Regular Expressions

Regular Expressions have the potential to become an entertaining pastime, if not a full-fledged hobby.  Python's implementation may not be the most complete one out there, but it's certainly world class.

From Wikipedia:

<blockquote>
A regular expression (shortened as regex or regexp; sometimes referred to as rational expression) is a sequence of characters that specifies a search pattern in text. Usually such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation. Regular expression techniques are developed in theoretical computer science and formal language theory. 
</blockquote>
    
Regexes were not featured in the relic high school *mathematics* textbooks we've considered.  However, they may have been included in an optional [computer science track](https://www.w3schools.com/python/python_regex.asp).

Back in those days, computer science (compsci) and mathematics were kept apart at the high school level, to a degree we might today find dumbfounding.

Regular Expression syntax was actually invented by Stephen Cole Kleene, [a mathematician](https://en.wikipedia.org/wiki/Stephen_Cole_Kleene), who was in turn a student of [Alonzo Church](https://en.wikipedia.org/wiki/Alonzo_Church) of [lambda calculus](https://en.wikipedia.org/wiki/Lambda_calculus) ($\lambda$ calc) fame.

In [13]:
import re

Let's start with [`findall`](https://pynative.com/python-regex-findall-finditer/).

Find all four-digit strings. ```\d``` stands for "any digit".

In [66]:
pattern = r"\d\d\d\d"  # raw string, prefix r
target  = "these four digit sequences: 8974 and 1059, will be matched.  Cool!"
re.findall(pattern, target)

['8974', '1059']

Another way to say it:

In [67]:
re.findall(r"\d{4}", "$8945 is a pretty good price for that speed boat, better than $9910")

['8945', '9910']

One or more digits in a row:

In [70]:
target = """GB 18030 is a Chinese government standard, 
described as Information Technology — Chinese coded character 
set and defines the required language and character support 
necessary for software in China. GB18030 is the registered 
Internet name for the official character set of the People's 
Republic of China (PRC) superseding GB2312.[1] 
As a Unicode Transformation Format[a] (i.e. an encoding of 
all Unicode code points), GB18030 supports both simplified 
and traditional Chinese characters. It is also compatible 
with legacy encodings including GB2312, CP936,[b] and GBK 1.0. 
"""
# https://en.wikipedia.org/wiki/GB_18030

re.findall(r"\d+", target)

['18030', '18030', '2312', '1', '18030', '2312', '936', '1', '0']
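When you also want to know *where* each match sits, `re.finditer` yields match objects instead of plain strings. Here is a minimal sketch; the shortened sample sentence is made up for illustration and is not the Wikipedia text above.

```python
import re

text = "GB18030 supersedes GB2312"  # hypothetical, shortened sample

# Each match object knows its matched text and its (start, end) span.
found = [(m.group(), m.span()) for m in re.finditer(r"\d+", text)]

for digits, span in found:
    print(digits, span)
```

The span is a half-open interval, so `text[start:end]` gives back the matched digits.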

Between four and six digits:

In [55]:
re.findall(r"\d{4,6}", "seeking 12, 134, 8974, 10599 and 2")

['8974', '10599']
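One caveat: without anchors, `\d{4,6}` will also match part of a longer run of digits. A small sketch, assuming we only want stand-alone numbers, adds the word-boundary marker `\b` on both sides (the sample string is made up for illustration):

```python
import re

sample = "seeking 12, 134, 8974, 10599 and 1234567"  # hypothetical sample text

# Without anchors, the seven-digit run contributes a partial match.
unanchored = re.findall(r"\d{4,6}", sample)

# \b marks a word boundary, so only whole 4-to-6 digit numbers match.
anchored = re.findall(r"\b\d{4,6}\b", sample)

print(unanchored)
print(anchored)
```

The first list picks up `'123456'` from the seven-digit number; the second leaves that number out entirely.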

The Regular Expression language is complex and yet ubiquitous enough to merit study.  

You may use regexes not only to find, but to find and replace:

In [64]:
re.sub(r"\d\d\d\d", "xxxx", "Your Visa card number is 9845-9049-7754-0011")  # raw string keeps \d intact

'Your Visa card number is xxxx-xxxx-xxxx-xxxx'
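A common variation keeps the last four digits visible. The sketch below is one possible approach, not the only one: it uses a lookahead, `(?=-)`, which requires a hyphen after the four digits without making the hyphen part of the match.

```python
import re

card = "Your Visa card number is 9845-9049-7754-0011"

# Mask only the four-digit groups that are followed by a hyphen;
# the final group has no hyphen after it, so it is left alone.
masked = re.sub(r"\d{4}(?=-)", "xxxx", card)
print(masked)
```

Because the lookahead does not consume the hyphen, the replacement text needs no hyphen of its own.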

And to split apart:

In [61]:
re.split(r"\s+", "splitting on  a word   boundary")

['splitting', 'on', 'a', 'word', 'boundary']
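Splitting is not limited to whitespace. As a small sketch, a character class such as `[,;]` lets one pattern handle several delimiters at once (the sample line is made up for illustration):

```python
import re

line = "apples, pears;  plums,bananas"  # hypothetical sample text

# Split on a comma or semicolon, swallowing any whitespace around it.
fruits = re.split(r"\s*[,;]\s*", line)
print(fruits)
```

The plain string method `str.split` accepts only one fixed separator, which is where `re.split` earns its keep.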

Including regexes in the study of mathematics expands one's innate sense of what mathematics contains, preparing the ground for the logic, typesetting, and markup notations one may meet later.  

To some extent, mathematics is a practice of cultivating fluency and proficiency with various symbol systems.  Does chess, with its corresponding symbol language, count as mathematical?  Perhaps.  

A more restrictive definition might insist that mathematical languages concern themselves with proofs. Regular Expressions would not qualify as mathematical either, in that case.