<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/ufidon/nlp/blob/main/01.re.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
  </td>
  <td>
    <a target="_blank" href="https://kaggle.com/kernels/welcome?src=https://github.com/ufidon/nlp/blob/main/01.re.ipynb"><img src="https://kaggle.com/static/images/open-in-kaggle.svg" /></a>
  </td>
</table>
<br>

# Regular Expressions

📝 SALP chapter 2

## 🍎 An Intriguing Example

How do we read and comprehend the text below?
- parse sentences, words
- search for patterns
- recognize named entities
- find the meaning of words in their context
- feel the sentiment, etc.

In [3]:
text = """
John Smith, 123 Main St, Anytown USA 12345
Phone: (555) 123-4567
Email: [john.smith@example.com](mailto:john.smith@example.com)
Occupation: Software Engineer

Jane Doe, 456 Elm St, Othertown USA 67890
Phone: 1-800-789-0123
Email: janedoe@gmail.com
Occupation: Marketing Manager
"""

We can find the following information from this text:

* Names (first and last)
* Addresses
* Phone numbers
* Email addresses
* Occupations

We extract this information subconsciously; **NLP procedures and concepts** make each step explicit.

## Text Normalization

**Definition:** The process of transforming text into a standard format to prepare it for further processing.

**Examples:**
- Converting all text to lowercase
- Removing punctuation
- Expanding contractions (e.g., "don't" to "do not")
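
The three steps above can be sketched with the standard library alone. The `CONTRACTIONS` table and the `normalize` function are illustrative names, not part of any library, and the table covers only a handful of contractions.

```python
import re

# Hypothetical, deliberately tiny contraction table for illustration.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "it's": "it is"}

def normalize(text):
    """Lowercase, expand known contractions, then strip punctuation."""
    text = text.lower()
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    # Drop every character that is neither a word character nor whitespace.
    text = re.sub(r"[^\w\s]", "", text)
    # Collapse runs of whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()

print(normalize("Don't panic, it's fine!"))  # do not panic it is fine
```

Contractions are expanded before punctuation is removed; otherwise the apostrophe would already be gone and "don't" could no longer be recognized.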



## Tokenizing / Tokenization

**Definition:** The process of breaking down text into smaller units called tokens, typically words or subwords.

**Example:**
Input: "The quick brown fox jumps over the lazy dog."
Output: ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog", "."]
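
A rough regex tokenizer that reproduces the output above might look like this. `simple_tokenize` is an illustrative name, and the pattern is a simplification: it would split "don't" into three tokens, which real tokenizers handle more carefully.

```python
import re

def simple_tokenize(text):
    # A token is either a run of word characters or a single punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = simple_tokenize("The quick brown fox jumps over the lazy dog.")
print(tokens)
```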



## Emoticons

**Definition:** Textual representations of facial expressions using punctuation and letters.

**Examples:**
- :) (smile)
- :( (sad)
- ;) (wink)
- :-O (surprised)
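
As a small sketch, the four emoticons listed above can be picked out with one character-class pattern. The pattern is an assumption made for this example and covers only eyes `:` or `;`, an optional nose `-`, and the mouths `)`, `(`, and `O`.

```python
import re

# Eyes, optional nose, mouth. Covers only the emoticons listed above.
EMOTICON = re.compile(r"[:;]-?[()O]")

message = "Nice :) but late :( anyway ;) wow :-O"
emoticons = EMOTICON.findall(message)
print(emoticons)
```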



## Hashtags

**Definition:** Words or phrases preceded by a hash sign (#) used to categorize content on social media platforms.

**Examples:**
- #NaturalLanguageProcessing
- #AI
- #MachineLearning
- #DataScience
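
Hashtags are easy to extract with a regular expression. The sketch below assumes a hashtag is a `#` followed by one or more word characters, which is simpler than the rules real platforms apply.

```python
import re

post = "Learning #NaturalLanguageProcessing and #AI today! #100DaysOfCode"
# "#" followed by letters, digits, or underscores.
hashtags = re.findall(r"#\w+", post)
print(hashtags)
```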



## Lemmatization

**Definition:** The process of reducing words to their base or dictionary form (lemma), considering the context and part of speech.

**Examples:**
- "running" → "run"
- "better" → "good"
- "mice" → "mouse"



## Lemmatizer

**Definition:** A tool or algorithm that performs lemmatization.

**Example:**
Using NLTK's WordNetLemmatizer:


In [None]:
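# Install NLTK on Google Colab if it is not already available
import sys
import importlib.util

in_colab = 'google.colab' in sys.modules
# find_spec reports whether the package can be imported,
# whereas 'nltk' in sys.modules only says whether it has been imported already
nltk_installed = importlib.util.find_spec('nltk') is not None

if in_colab and not nltk_installed:
    print("NLTK is not installed. Installing now...")
    %pip install nltk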

In [3]:
import nltk
nltk.download('wordnet')

from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize("running", pos="v"))  # Output: "run"

[nltk_data] Downloading package wordnet to /home/shanren/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


run




## Stemming

**Definition:** The process of reducing words to their root form by removing affixes, often using heuristic rules.

**Examples:**
- "running" → "run"
- "happiness" → "happi"
- "convertible" → "convert"



## Sentence Segmentation

**Definition:** The process of dividing text into individual sentences.

**Example:**
Input: "Mr. Smith bought a new car. It was very expensive. He loves it!"
Output: 
1. "Mr. Smith bought a new car."
2. "It was very expensive."
3. "He loves it!"
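
The difficulty in this example is that the period in "Mr." does not end a sentence. A minimal sketch, assuming a short hard-coded list of titles (an illustrative choice, not a complete solution), uses lookbehind assertions to skip those periods:

```python
import re

# Split on whitespace that follows ., ! or ?, unless the period belongs to
# one of a few known titles. The title list is an illustrative assumption.
SENTENCE_BOUNDARY = re.compile(r"(?<!Mr\.)(?<!Ms\.)(?<!Dr\.)(?<=[.!?])\s+")

def split_sentences(text):
    return SENTENCE_BOUNDARY.split(text.strip())

sentences = split_sentences(
    "Mr. Smith bought a new car. It was very expensive. He loves it!"
)
for number, sentence in enumerate(sentences, start=1):
    print(number, sentence)
```

Trained segmenters such as NLTK's `sent_tokenize` learn abbreviations from data instead of relying on a fixed list.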



## Edit Distance

**Definition:** A measure of the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one word into another.

**Example:**
Edit distance between "kitten" and "sitting":
1. kitten → sitten (substitution of "s" for "k")
2. sitten → sittin (substitution of "i" for "e")
3. sittin → sitting (insertion of "g" at the end)

Edit distance: 3
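
The distance can be computed with dynamic programming. The sketch below assumes a cost of 1 for each insertion, deletion, and substitution, matching the definition above; `edit_distance` is an illustrative name.

```python
def edit_distance(source, target):
    """Levenshtein distance with unit cost for insert, delete, substitute."""
    # previous[j] is the distance between source[:i-1] and target[:j].
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning source[:i] into "" takes i deletions
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]

print(edit_distance("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, so memory grows with the length of `target` rather than with the product of both lengths.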

## Implementation in Python regular expressions

In [2]:
import re

# Define regex patterns for each piece of information
name_pattern = r"[A-Za-z]+ [A-Za-z]+"
address_pattern = r"\d+ [A-Za-z]+ St, [A-Za-z]+ USA \d{5}"
# Two phone formats: (555) 123-4567 or 1-800-789-0123
phone_pattern = r"\(\d{3}\) \d{3}-\d{4}|\d-\d{3}-\d{3}-\d{4}"
email_pattern = r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+"
occupation_pattern = r"Software Engineer|Marketing Manager"

# Use regex to find all occurrences of each pattern
names = re.findall(name_pattern, text)
addresses = re.findall(address_pattern, text)
phones = re.findall(phone_pattern, text)
emails = re.findall(email_pattern, text)
occupations = re.findall(occupation_pattern, text)

# Print the extracted information
print("Names:")
for name in names:
    print(name)

print("\nAddresses:")
for address in addresses:
    print(address)

print("\nPhone Numbers:")
for phone in phones:
    print(phone)

print("\nEmail Addresses:")
for email in emails:
    print(email)

print("\nOccupations:")
for occupation in occupations:
    print(occupation)


Names:
John Smith
Main St
Anytown USA
Software Engineer
Jane Doe
Elm St
Othertown USA
Marketing Manager

Addresses:
123 Main St, Anytown USA 12345
456 Elm St, Othertown USA 67890

Phone Numbers:
(555) 123-4567
1-800-789-0123

Email Addresses:
john.smith@example.com
john.smith@example.com
janedoe@gmail.com

Occupations:
Software Engineer
Marketing Manager


The regex features used:

* Character classes (`[A-Za-z]`, `[a-zA-Z0-9_.+-]`) and the shorthand class `\d`
* Escaped metacharacters (`\(`, `\)`, `\.`) that match literal parentheses and periods
* Alternation (`|`)
* Quantifiers (`+`, `{3}`, `{4}`, `{5}`)


## Introduction to Regular Expressions (re)
* Regular expressions (regex) are a powerful tool for matching patterns in text data.
* In NLP, regex is used for tasks such as:
	+ Text preprocessing
	+ Information extraction
	+ Sentiment analysis

## 🏃 [regex101](https://regex101.com/)

## Basic Concepts
* **Pattern**: A regular expression is a pattern that matches one or more strings of text.
* **Literal characters**: Characters that match themselves (e.g. `a` matches the letter "a").
* **Metacharacters**: Characters that have special meanings (e.g. `.` matches any single character).
* **Escaping**: Using a backslash (`\`) to treat metacharacters as literal characters.
* **Corpus**: A large collection of text where regular expressions are applied for pattern matching.

**Example:**
- Pattern: `\bcat\b`
- Corpus: "The cat sat on the mat."
- Match: "cat"

| **Special Character** | **Description**                                                    | **Escape Sequence**   | **Example**                        |
|-----------------------|--------------------------------------------------------------------|-----------------------|------------------------------------|
| `.`                   | Matches any character except newline.                              | `\.`                  | `a\.b` matches `a.b`               |
| `^`                   | Matches the start of a string.                                     | `\^`                  | `\^abc` matches `^abc`             |
| `$`                   | Matches the end of a string.                                       | `\$`                  | `abc\$` matches `abc$`             |
| `*`                   | Matches 0 or more repetitions of the preceding element.            | `\*`                  | `a\*b` matches `a*b`               |
| `+`                   | Matches 1 or more repetitions of the preceding element.            | `\+`                  | `a\+b` matches `a+b`               |
| `?`                   | Matches 0 or 1 repetition of the preceding element.                | `\?`                  | `a\?b` matches `a?b`               |
| `{}`                  | Matches a specified number of repetitions of the preceding element.| `\{\}`                | `a\{2\}` matches `a{2}`            |
| `[]`                  | Denotes a character class.                                         | `\[\]`                | `\[\]` matches `[]`                |
| `()`                  | Denotes a group or captures the matched content.                   | `\(\)`                | `a\(\)` matches `a()`              |
| `\|`                   | Acts as an OR operator.                                            | `\\|`                  | `a\\|b` matches `a\|b`               |
| `\`                   | Escapes a special character.                                       | `\\`                  | `\\` matches `\`                   |
| `/`                   | Delimits a regular expression pattern in some languages.           | `\/`                  | `\/` matches `/`                   |
| `-`                   | Indicates a range in a character class.                            | `\-`                  | `[a\-z]` matches `a`, `-`, or `z`  |
| `:`                   | Used in some special sequences (e.g., POSIX).                      | `\:`                  | `\:` matches `:`                   |
| `!`                   | Used for negation in some contexts (e.g., negative lookahead).     | `\!`                  | `\!` matches `!`                   |
| `"`                   | Used in some contexts, e.g., within JSON.                          | `\"`                  | `\"` matches `"`                   |
| `'`                   | Delimits string literals in some languages.                        | `\'`                  | `\'` matches `'`                   |
| `#`                   | Used in some languages as a comment character.                     | `\#`                  | `\#` matches `#`                   |
| `<`                   | Used in lookbehind assertions and named groups.                    | `\<`                  | `\<` matches `<`                   |
| `>`                   | Used in named groups, e.g. `(?P<name>...)`.                        | `\>`                  | `\>` matches `>`                   |


## Concatenation, Kleene Star, and Kleene Plus

**Concatenation**: Combining two or more patterns in sequence.
- Example: `a` followed by `b` gives the pattern `ab`, which matches "ab".

**Kleene Star (`*`)**: Matches zero or more occurrences of the preceding element.
- Example: `a*` matches "", "a", "aa", etc.

**Kleene Plus (`+`)**: Matches one or more occurrences of the preceding element.
- Example: `a+` matches "a", "aa", "aaa", etc.
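
A quick check of these three operators, using `re.fullmatch` so that a pattern must account for the entire string. The candidate strings are chosen only for illustration.

```python
import re

candidates = ["", "a", "aaa", "ab"]

# fullmatch succeeds only if the whole string fits the pattern.
star = {s: bool(re.fullmatch(r"a*", s)) for s in candidates}
plus = {s: bool(re.fullmatch(r"a+", s)) for s in candidates}
concat = bool(re.fullmatch(r"ab", "ab"))

print("a*:", star)   # the empty string is accepted
print("a+:", plus)   # the empty string is rejected
print("ab:", concat)
```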

## Disjunction, Character Class, and Range

**Disjunction (`|`)**: Matches either pattern on its left or right.
- Example: `cat|dog` matches "cat" or "dog".

**Character Class**: Matches any one character within a defined set.
- Example: `[abc]` matches "a", "b", or "c".

**Range**: Shorthand notation for specifying a range of characters.
- Example: `[a-z]` matches any lowercase letter from 'a' to 'z'.
- `[A-Z]` matches any uppercase letter from 'A' to 'Z'.
- `[0-9]` matches any digit from '0' to '9'.

| **Special Range** | **Description**                                                                                  | **Example**                      |
|-------------------|--------------------------------------------------------------------------------------------------|----------------------------------|
| `[a-z]`           | Matches any lowercase letter from a to z.                                                        | `b`, `m`, `z`                    |
| `[A-Z]`           | Matches any uppercase letter from A to Z.                                                        | `B`, `M`, `Z`                    |
| `[0-9]`           | Matches any digit from 0 to 9.                                                                   | `0`, `5`, `9`                    |
| `[a-zA-Z]`        | Matches any letter, whether uppercase or lowercase.                                              | `a`, `Z`                         |
| `[a-zA-Z0-9]`     | Matches any alphanumeric character (letters and digits).                                         | `b`, `7`, `Q`                    |
| `[aeiou]`         | Matches any vowel.                                                                               | `a`, `e`, `i`                    |
| `[^a-z]`          | Matches any character that is not a lowercase letter.                                            | `A`, `7`, `@`                    |
| `[\w]`            | Matches any word character (`[a-zA-Z0-9_]` under `re.ASCII`; Unicode-aware by default).          | `a`, `5`, `_`                    |
| `[\W]`            | Matches any non-word character (`[^a-zA-Z0-9_]` under `re.ASCII`).                               | `@`, `#`, `!`                    |
| `[\d]`            | Matches any digit (`[0-9]` under `re.ASCII`; any Unicode decimal digit by default).              | `2`, `9`                         |
| `[\D]`            | Matches any non-digit (`[^0-9]` under `re.ASCII`).                                               | `a`, `Q`, `!`                    |
| `[\s]`            | Matches any whitespace character (spaces, tabs, line breaks).                                    | ` `, `\t`, `\n`                  |
| `[\S]`            | Matches any non-whitespace character (`[^ \t\n\r\f\v]` under `re.ASCII`).                        | `A`, `9`, `@`                    |

## Anchors

**Definition:**
- **Anchors**: Special characters that match positions within the text rather than actual characters.

**Examples:**
- `^`: Matches the start of a string.
- `$`: Matches the end of a string.
- `\b`: Matches a word boundary.

**Example:**
- Pattern: `^cat`
- Corpus: "cat is cute."
- Match: "cat"
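
The following sketch shows each anchor at work. The sample string is made up for this example, and `re.MULTILINE` is included to show how `^` changes from "start of string" to "start of each line".

```python
import re

sample = "cat is cute.\ncat naps. A bobcat concatenates"

start_only = re.findall(r"^cat", sample)                     # start of string
each_line = re.findall(r"^cat", sample, flags=re.MULTILINE)  # start of every line
whole_word = re.findall(r"\bcat\b", sample)                  # "cat" as a full word
at_end = re.search(r"concatenates$", sample) is not None     # end of string

print(len(start_only), len(each_line), len(whole_word), at_end)
```

`\bcat\b` skips "bobcat" and "concatenates" because there is no word boundary on both sides of "cat" in those words.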

## Grouping, Precedence, and Disjunction

**Grouping (`()`)**: Groups patterns and controls operator precedence.
- Example: `(cat|dog)s` matches "cats" or "dogs".

**Precedence**: Determines the order in which regular expression operators are evaluated.

**Disjunction (`|`)**: Matches either of the patterns in the group.
- Example: `cat|dog` matches "cat" or "dog".

## re Operators and Precedence

**Operators in Order of Precedence**:
1. `()` - Grouping
2. `[]` - Character Class
3. `*`, `+`, `?`, `{}` - Quantifiers
4. Sequences (concatenation) and anchors (`^`, `$`, `\b`)
5. `|` - Disjunction

**Example:**
- Pattern: `a(bc|de)f`
- Matches: "abcf" or "adef"
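
Precedence is easiest to see by comparing a pattern with and without parentheses. In this sketch, `fits` is a small helper defined here for illustration.

```python
import re

def fits(pattern, s):
    # True when the entire string s matches pattern.
    return re.fullmatch(pattern, s) is not None

# Quantifiers bind tighter than concatenation: in `the*` the star covers only "e".
print(fits(r"the*", "theee"), fits(r"the*", "thethe"))         # True False
# Parentheses override that, so the star covers the whole group.
print(fits(r"(the)*", "thethe"))                               # True
# Concatenation binds tighter than |: `cat|dogs` means "cat" or "dogs".
print(fits(r"cat|dogs", "cats"), fits(r"(cat|dog)s", "cats"))  # False True
```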

## Quantifiers

**Definition**: Specifies the number of occurrences of a character or group.

**Examples:**
- `*`: 0 or more occurrences
- `+`: 1 or more occurrences
- `?`: 0 or 1 occurrence
- `{n}`: Exactly n occurrences
- `{n,}`: n or more occurrences
- `{n,m}`: Between n and m occurrences

## Greedy and Nongreedy Matching

**Greedy Matching**: Attempts to match the longest possible string.
- Example: `a.*b` matches "aabcdb" in "aabcdb".

**Nongreedy Matching**: Attempts to match the shortest possible string.
- Example: `a.*?b` matches "aab" in "aabcdb".
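
Both behaviours can be checked directly. The second pair of patterns, on a made-up HTML-like string, shows why the difference matters in practice.

```python
import re

s = "aabcdb"
greedy = re.search(r"a.*b", s).group()   # extends to the last "b"
lazy = re.search(r"a.*?b", s).group()    # stops at the first "b"
print(greedy, lazy)

markup = "<b>bold</b>"
greedy_tag = re.search(r"<.*>", markup).group()   # swallows the whole string
lazy_tag = re.search(r"<.*?>", markup).group()    # just the opening tag
print(greedy_tag, lazy_tag)
```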

## Substitution, Capture Groups, and Non-Capturing Groups

**Substitution (`re.sub`)**: Replaces matched patterns with a specified replacement.

**Capture Groups**: Use `()` to capture a part of the match for later use.
- Example: `(\w+)` captures a word.

**Non-Capturing Groups `(?:...)`**: Groups without capturing.
- Example: `(?:cat|dog)s` matches "cats" or "dogs" without capturing "cat" or "dog".

**Example:**
- Pattern: `(\w+)`
- Replacement: `\1-\1`
- Corpus: "cat"
- Result: "cat-cat"

## Regular Expression Substitution

```python
re.sub(pattern, replacement, string)
```

- **`pattern`**: The regular expression that defines the text to be matched.
- **`replacement`**: The string to replace the matched text.
- **`string`**: The input string where the substitution will occur.

#### Example 1: Basic Substitution
**Goal**: Replace all occurrences of the word "cat" with "dog".

In [2]:
import re

text = "The cat sat on the cat mat."
result = re.sub(r'cat', 'dog', text)

print(result)

The dog sat on the dog mat.


#### Example 2: Substitution with Capture Groups
**Goal**: Switch the first and last names in a list of names.

In [3]:
import re

text = "John Doe, Jane Smith"
pattern = r'(\w+) (\w+)'
replacement = r'\2 \1'

result = re.sub(pattern, replacement, text)

print(result)

Doe John, Smith Jane


#### Example 3: Using Backreferences in Substitution
**Goal**: Surround each word with parentheses.

In [4]:
import re

text = "cat dog"
pattern = r'(\w+)'
replacement = r'(\1)'

result = re.sub(pattern, replacement, text)

print(result)

(cat) (dog)


#### Example 4: Substitution with Function
**Goal**: Replace each word with its uppercase version.

In [5]:
import re

text = "cat dog"
pattern = r'(\w+)'

def to_upper(match):
    return match.group(1).upper()

result = re.sub(pattern, to_upper, text)

print(result)

CAT DOG


#### Example 5: Limiting the Number of Substitutions
**Goal**: Replace only the first occurrence of "cat" with "dog".

In [6]:
import re

text = "cat cat cat"
result = re.sub(r'cat', 'dog', text, count=1)

print(result)

dog cat cat


#### Example 6: Non-Capturing Groups in Substitution
**Goal**: Replace "Mr." and "Ms." with "Mx."

In [7]:
import re

text = "Mr. John Doe, Ms. Jane Doe"
pattern = r'(?:Mr|Ms)\.'
replacement = r'Mx.'

result = re.sub(pattern, replacement, text)

print(result)

Mx. John Doe, Mx. Jane Doe


## Lookahead Assertions

**Lookahead**: Asserts that a pattern follows the current position without consuming characters.

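**Positive Lookahead (`(?=...)`)**:
- Example: `\w+(?=\d)` matches word characters that are immediately followed by a digit; that digit is not part of the match ("abc" in "abc1").

**Negative Lookahead (`(?!...)`)**:
- Example: `cat(?!\d)` matches "cat" only when it is not immediately followed by a digit ("cat" in "cats", but not in "cat9").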
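
A short sketch of both kinds of lookahead. The sample strings and patterns are chosen for illustration; note that the looked-ahead character is never included in the match.

```python
import re

# Positive lookahead: letters that are immediately followed by a digit.
items = "apple5 pear plum12"
before_digit = re.findall(r"[a-z]+(?=\d)", items)
print(before_digit)

# Negative lookahead: "cat" unless a digit comes right after it.
# re.sub marks the matches so they are easy to see in context.
marked = re.sub(r"cat(?!\d)", "CAT", "cat9 cats cat")
print(marked)
```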

## False Positives and False Negatives in Matching, Precision and Recall

- **False Positives**: Strings the pattern matches that it should not have matched.
- **False Negatives**: Strings the pattern should have matched but missed.
- **Precision**: Ratio of true positives to all matches returned (true positives + false positives).
- **Recall**: Ratio of true positives to all actual positives (true positives + false negatives).

**Example:**
- Pattern: `cat`
- Corpus: "The catalog is here."
- False Positive: the "cat" inside "catalog"
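
To make the two measures concrete, the sketch below scores a loose and a stricter pattern against a hand-labelled word list. The word list, the gold labels, and the function name `precision_recall` are all made up for this illustration.

```python
import re

def precision_recall(predicted, gold):
    """Return (precision, recall) for two sets of item positions."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

words = ["The", "cat", "saw", "the", "catalog", "and", "two", "Cats"]
gold = {1, 7}  # hand-labelled positions that really mention cats

# Loose pattern: hits "catalog" (false positive), misses "Cats" (false negative).
loose = {i for i, w in enumerate(words) if re.search(r"cat", w)}
# Stricter pattern: whole word, optional plural, any letter case.
strict = {i for i, w in enumerate(words) if re.fullmatch(r"(?i)cats?", w)}

print(precision_recall(loose, gold))   # (0.5, 0.5)
print(precision_recall(strict, gold))  # (1.0, 1.0)
```

Reducing false positives raises precision and reducing false negatives raises recall; refining a pattern usually means trading one against the other.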

## Comprehensive Example

**Pattern**: `^(Mr|Ms|Dr)\.?\s(\w+)\s(\w+)$`

**Explanation**:
- `^` asserts the start of the string.
- `(Mr|Ms|Dr)` captures the title "Mr", "Ms", or "Dr".
- `\.?` optionally matches a period.
- `\s` matches a whitespace character.
- The first `(\w+)` captures the first name and the second `(\w+)` captures the last name.
- `$` asserts the end of the string.

**Example Corpus**: "Dr. John Doe"

**Match**: "Dr. John Doe"

In [1]:
import re

# Comprehensive pattern
pattern = r"^(Mr|Ms|Dr)\.?\s(\w+)\s(\w+)$"

# Sample text
text = "Dr. John Doe"

# Match
match = re.match(pattern, text)

if match:
    title = match.group(1)
    first_name = match.group(2)
    last_name = match.group(3)
    print(f"Title: {title}, First Name: {first_name}, Last Name: {last_name}")
else:
    print("No match found.")

Title: Dr, First Name: John, Last Name: Doe


🔗 [re — Regular expression operations](https://docs.python.org/3/library/re.html)

## 🏃 Practice
- Try common Python regexes on [regex101](https://regex101.com/)

## 🍎 Define a pattern to match the word "good" through refinement


### Step 1: Basic Matching

```regex
good
```

- It matches the exact sequence of characters "good" in the text.

### Step 2: Word Boundaries
- Avoid matching words like "goodbye" or "goodness".

```regex
\bgood\b
```

### Step 3: Case Insensitivity
- Match "good" regardless of letter case.

```regex
(?i)\bgood\b
```

- `(?i)` is an inline flag that makes the whole pattern case-insensitive.

### Step 4: Handling Optional Characters
- Allow an optional punctuation mark right after the word, as in "good!". The `\b` still rejects "goods".

```regex
(?i)\bgood\b[^\w]?
```

- `[^\w]?` matches an optional non-word character (like punctuation) after "good".

### Step 5: Match Variations (Optional)
- Expand the pattern to match variations or synonyms of "good" if needed.

```regex
(?i)\b(good|great|excellent)\b
```


### Step 6: Full Refinement
- Require "good" to stand alone: it must be followed by a non-word character or by the end of the string.

```regex
(?i)\bgood\b([^\w]|$)
```

### Example Use
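Here’s how the refined pattern works with various inputs. The matched word is shown; the full match also includes the following non-word character, if there is one: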

- **Input:** "Good job!"
  - **Matches:** "Good"
  
- **Input:** "The goods are here."
  - **Doesn't Match:** "goods" (since "good" is only part of the longer word)

- **Input:** "She is a good person."
  - **Matches:** "good"

- **Input:** "GOOD."
  - **Matches:** "GOOD"
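
The four inputs above can be run against the Step 6 pattern as follows. The `results` dictionary is just a convenient way to collect the full matches, including the trailing non-word character.

```python
import re

GOOD = re.compile(r"(?i)\bgood\b([^\w]|$)")

inputs = ["Good job!", "The goods are here.", "She is a good person.", "GOOD."]
results = {}
for sentence in inputs:
    match = GOOD.search(sentence)
    # Store the full matched text, or None when the pattern does not match.
    results[sentence] = match.group() if match else None

for sentence, matched in results.items():
    print(repr(sentence), "->", repr(matched))
```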

## Using Regex in NLP
* **Preprocessing text data**: Use regex to remove punctuation, convert to lowercase, etc.
* **Extracting information**: Use regex to extract specific patterns from text data (e.g. phone numbers, email addresses).
* **Sentiment analysis**: Use regex to extract sentiment-bearing phrases from text data.

## 🍎 Example
Text Analysis with NLTK:

* Tokenize the text into individual words and sentences
* Perform stemming on the tokens (i.e., reduce words to their base form)
* Identify named entities in the text (e.g., people, places, organizations)

In [4]:
import nltk
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.stem import PorterStemmer
from nltk.chunk import ne_chunk

# Download required NLTK data if necessary
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('words')
nltk.download('punkt_tab')
nltk.download('averaged_perceptron_tagger_eng')
nltk.download('maxent_ne_chunker_tab')

# use the same text as above
# text = """
# The quick brown fox jumps over the lazy dog. The sun is shining brightly today.
# """

# Tokenize the text into individual words and sentences
word_tokens = word_tokenize(text)
sentence_tokens = sent_tokenize(text)

print("Word Tokens:")
for token in word_tokens:
    print(token)

print("\nSentence Tokens:")
for sentence in sentence_tokens:
    print(sentence)

# Perform stemming on the tokens
stemmer = PorterStemmer()
stemmed_words = [stemmer.stem(word) for word in word_tokens]

print("\nStemmed Words:")
for stemmed_word in stemmed_words:
    print(stemmed_word)

# Identify named entities in the text
tagged_text = nltk.pos_tag(word_tokenize(text))
named_entities = ne_chunk(tagged_text)

print("\nNamed Entities:")
for tree in named_entities:
    if hasattr(tree, 'label'):
        print(tree.label(), end=': ')
        for leaf in tree.leaves():
            print(leaf[0], end=' ')
        print()

[nltk_data] Downloading package punkt to /home/qingshan/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /home/qingshan/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package words to /home/qingshan/nltk_data...
[nltk_data]   Package words is already up-to-date!
[nltk_data] Downloading package punkt_tab to
[nltk_data]     /home/qingshan/nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger_eng to
[nltk_data]     /home/qingshan/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger_eng is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package maxent_ne_chunker_tab to
[nltk_data]     /home/qingshan/nltk_data...
[nltk_data]   Package maxent_ne_chunker_tab is already up-to-date!


Word Tokens:
John
Smith
,
123
Main
St
,
Anytown
USA
12345
Phone
:
(
555
)
123-4567
Email
:
[
john.smith
@
example.com
]
(
mailto
:
john.smith
@
example.com
)
Occupation
:
Software
Engineer
Jane
Doe
,
456
Elm
St
,
Othertown
USA
67890
Phone
:
1-800-789-0123
Email
:
janedoe
@
gmail.com
Occupation
:
Marketing
Manager

Sentence Tokens:

John Smith, 123 Main St, Anytown USA 12345
Phone: (555) 123-4567
Email: [john.smith@example.com](mailto:john.smith@example.com)
Occupation: Software Engineer

Jane Doe, 456 Elm St, Othertown USA 67890
Phone: 1-800-789-0123
Email: janedoe@gmail.com
Occupation: Marketing Manager

Stemmed Words:
john
smith
,
123
main
st
,
anytown
usa
12345
phone
:
(
555
)
123-4567
email
:
[
john.smith
@
example.com
]
(
mailto
:
john.smith
@
example.com
)
occup
:
softwar
engin
jane
doe
,
456
elm
st
,
othertown
usa
67890
phone
:
1-800-789-0123
email
:
janedo
@
gmail.com
occup
:
market
manag

Named Entities:
PERSON: John 
GPE: Smith 
PERSON: Anytown 
PERSON: Software Engineer Jane

## NLTK features used:

* Tokenization (`word_tokenize`, `sent_tokenize`)
* Stemming (`PorterStemmer`)
* Part-of-speech tagging (`pos_tag`)
* Named entity recognition (`ne_chunk`)

🔗 [Natural Language Toolkit](https://www.nltk.org/)

## 🍎 Application: ELIZA Chatbot

### What is ELIZA?
- ELIZA: named after Eliza Doolittle, a character in George Bernard Shaw's play *Pygmalion*
- A pioneering chatbot in artificial intelligence
- An early natural language processing computer program
- Created by Joseph Weizenbaum at MIT from 1964 to 1966
- One of the first chatbots in the history of artificial intelligence


### How ELIZA Works

1. Uses pattern matching and substitution methodology
2. Simulates conversation by using pre-programmed responses
3. Aims to engage users in a manner similar to a Rogerian psychotherapist
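
The pattern-matching-and-substitution idea can be sketched in a few lines. The rules below are hypothetical examples written in the spirit of ELIZA; they are not Weizenbaum's original script, and `RULES`, `DEFAULT`, and `respond` are names invented for this illustration.

```python
import re

# Hypothetical rules: (pattern, response template). \1 is the captured text.
RULES = [
    (re.compile(r"\bI am (.*)", re.IGNORECASE), r"Why do you say you are \1?"),
    (re.compile(r"\bI need (.*)", re.IGNORECASE), r"What would it mean to you if you got \1?"),
    (re.compile(r"\ball\b", re.IGNORECASE), "In what way?"),
]
DEFAULT = "Please go on."

def respond(utterance):
    # Strip final punctuation so it does not end up inside the reply.
    utterance = utterance.rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            # Fill backreferences in the template from the match.
            return match.expand(template)
    return DEFAULT

print(respond("I am unhappy."))       # Why do you say you are unhappy?
print(respond("Men are all alike."))  # In what way?
print(respond("Hello"))               # Please go on.
```

Rules are tried in order and the first match wins, so more specific patterns should come before general ones.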


### Key Features

- Keyword identification
- Contextual pattern matching
- Transformation rules to convert input to output
- Ability to maintain a conversational state


### Historical Significance

- Demonstrated the potential of human-computer interaction
- Sparked discussions about AI and its implications
- Influenced development of subsequent chatbots and conversational AI


### Legacy and Impact

- Raised questions about the nature of intelligence and understanding
- Contributed to ongoing debates in AI ethics and philosophy
- Inspired further research in natural language processing and AI

### 🍎 [A simple ELIZA](https://stackoverflow.com/questions/54777612/regex-python-rule-based-eliza-implementation)
- [code](./codes/01/seliza.py)

🔗 [Eliza chatbot in Python](https://github.com/wadetb/eliza)