<a href="https://colab.research.google.com/github/lianacdubs/python/blob/main/Winter_2024_Discussion_Notebook_7.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ECS32A Discussion Notebook 7 Mar 11 - Mar 15

During the last week of discussion sections we will review regular expressions and classes.

 1. Regular Expressions
 2. Defining a new data type with a class definition
 3. Constructing a new object with the ```__init__``` method
 4. Writing class methods
 5. Manipulating object properties


# Introduction to Regular Expressions in Python : Regex

Regular expressions, or regex, are sequences of characters that define a search pattern.
They are a powerful tool for finding, formatting, and replacing text within a string, and most programming languages support them.

The following is some common regex syntax:

```text
/  -- delimiter marking the start and end of a regex (used by some languages and tools; Python writes patterns as strings instead)
?  -- match 0 or 1 time
*  -- match 0 or more times
+  -- match 1 or more times
[] -- set or range of acceptable characters
{} -- repetition count, e.g. {n} matches exactly n times
|  -- alternation (create different branches)
() -- grouping
i  -- case-insensitive flag (re.IGNORECASE in Python)
^  -- anchor to the beginning of the string
$  -- anchor to the end of the string
```
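
A minimal sketch of several of these symbols in Python's re module. The sample strings and variable names are made up for illustration. Note that Python patterns are ordinary (usually raw) strings rather than `/.../` literals.

```python
import re

# ? makes the preceding character optional: "color" and "colour" both match
optional_u = [w for w in ["color", "colour", "colr"] if re.fullmatch(r"colou?r", w)]

# {3} repeats the preceding item exactly three times
three_digits = re.findall(r"\b[0-9]{3}\b", "12 345 6789 012")

# | creates branches, () groups them, s? allows an optional plural
pets = re.findall(r"(?:cat|dog)s?", "cats and a dog and a cow")

# ^ and $ anchor the pattern to the start and end of the string
anchored = bool(re.search(r"^ab+c$", "abbbc"))

# re.IGNORECASE plays the role of the "i" flag
ignore_case = bool(re.search(r"ecs", "ECS 32A", re.IGNORECASE))

print(optional_u)             # ['color', 'colour']
print(three_digits)           # ['345', '012']
print(pets)                   # ['cats', 'dog']
print(anchored, ignore_case)  # True True
```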

# The Python "re" module provides regular expression support.

Regular expressions are a powerful language for matching text patterns. This notebook gives a basic introduction to regular expressions, and shows how regular expressions work in Python.

In Python a regular expression search is typically written as:

  match = re.search(pat, str)
  
The re.search() method takes a regular expression pattern and a string and searches for that pattern within the string. If the search succeeds, search() returns a match object; otherwise it returns None. Therefore, the search is usually followed immediately by an if-statement that tests whether the search succeeded, as in the examples that follow.
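
A minimal sketch of this search-then-test idiom, searching for the pattern 'word:' followed by a three-letter word. The sample string and the variable name `found` are assumptions chosen for illustration.

```python
import re

text = "an example word:cat!!"
match = re.search(r"word:\w\w\w", text)

# A match object is truthy; None (no match) is falsy
if match:
    found = match.group()   # the text that matched the pattern
    print("found", found)   # found word:cat
else:
    found = None
    print("did not find")
```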

# Searching for Patterns in Text


One of the most common uses for the re module is for finding patterns in text. Let's do a quick example of using the search method in the re module to find some text:

## Example 1: Search the string to see if it starts with "ECS 32A" and ends with "UC Davis":

In [None]:
import re

txt = "ECS 32A is an introduction to programming course in UC Davis"
match = re.search("^ECS 32A.*UC Davis$", txt)

if match:
  print("YES! We have a match!")
else:
  print("No match")

YES! We have a match!


## Basic Patterns

The code match = re.search(pat, str) stores the search result in a variable named "match". The if-statement then tests the match: if it is true, the search succeeded. If it is false (None, to be more specific), the search did not succeed and there is no matching text.

The power of regular expressions is that they can specify patterns, not just fixed characters. Here are the most basic patterns, which match single characters:

```text
a, X, 9, < -- ordinary characters just match themselves exactly. The meta-characters which do not match themselves because they have special meanings are: . ^ $ * + ? { [ ] \ | ( ) (details below)

. (a period) -- matches any single character except newline '\n'

\w -- (lowercase w) matches a "word" character: a letter or digit or underbar [a-zA-Z0-9_]. Note that although "word" is the mnemonic for this, it only matches a single word char, not a whole word. \W (upper case W) matches any non-word character.

\b -- boundary between word and non-word

\s -- (lowercase s) matches a single whitespace character -- space, newline, return, tab, form feed [ \n\r\t\f]. \S (upper case S) matches any non-whitespace character.

\t, \n, \r -- tab, newline, return

\d -- decimal digit [0-9] (some older regex utilities do not support \d, but they all support \w and \s)

^ = start, $ = end -- match the start or end of the string

\ -- inhibit the "specialness" of a character. So, for example, use \. to match a period or \\ to match a backslash. If you are unsure whether a character has special meaning, such as '@', you can put a backslash in front of it, \@, to make sure it is treated just as a character.
```
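
The patterns above can be tried out with a short sketch like the following. All of the sample strings and variable names are made up for illustration.

```python
import re

# \d matches a digit; + repeats it one or more times
digits = re.findall(r"\d+", "Room 101: call x-5523 at 9am")

# \w matches letters, digits, and underscore, but not space or hyphen
words = re.findall(r"\w+", "a_b c-d")

# \b only matches at a boundary between word and non-word characters
whole_word = re.findall(r"\bpen\b", "pen pens open pen.")

# . matches any single character except a newline
any_char = re.findall(r"c.t", "cat cot c\nt cut")

# \. matches a literal period only
literal_dot = re.findall(r"\d\.\d", "3.5 and 3x5")

# \s matches spaces, tabs, and newlines
pieces = re.split(r"\s+", "tab\tand  spaces\nnewline")

print(digits)       # ['101', '5523', '9']
print(words)        # ['a_b', 'c', 'd']
print(whole_word)   # ['pen', 'pen']
print(any_char)     # ['cat', 'cot', 'cut']
print(literal_dot)  # ['3.5']
print(pieces)       # ['tab', 'and', 'spaces', 'newline']
```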


## Example 2: Return the domain of given email addresses

Suppose you are given a string of email addresses such as sample@ucdavis.edu. The task is to extract the domain of each address, e.g. ucdavis, using the findall function.

This is a common problem in web applications, where a website such as Facebook takes email addresses as input from users.

In [None]:
result=re.findall(r'@\w+','sample@ucdavis.edu,abc.test@gmail.com, xyz@test.in, test.first@analyticsvidhya.com, first.test@rest.biz')
print(result)

['@ucdavis', '@gmail', '@test', '@analyticsvidhya', '@rest']


Above, you can see that the ".edu", ".com", and ".in" parts are not extracted. To include them, try the code below.

In [None]:
# \. matches a literal period; an unescaped . would match any character
result=re.findall(r'@\w+\.\w+','sample@ucdavis.edu, abc.test@gmail.com, xyz@test.in, test.first@analyticsvidhya.com, first.test@rest.biz')
print(result)

['@ucdavis.edu', '@gmail.com', '@test.in', '@analyticsvidhya.com', '@rest.biz']


Now we have the full domain of every email address.
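
The results so far still include the leading '@'. One way to return only the domain name is to wrap part of the pattern in parentheses, because findall returns just the grouped text when the pattern contains a group. A minimal sketch, using a shortened email list and variable names chosen for illustration:

```python
import re

emails = "sample@ucdavis.edu, abc.test@gmail.com, xyz@test.in"

# One group: findall returns only the text captured by the group
names = re.findall(r"@(\w+)\.\w+", emails)

# Two groups: findall returns a list of tuples, one tuple per match
parts = re.findall(r"@(\w+)\.(\w+)", emails)

print(names)  # ['ucdavis', 'gmail', 'test']
print(parts)  # [('ucdavis', 'edu'), ('gmail', 'com'), ('test', 'in')]
```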

## Example 3: Extracting emails from a Text Document

Sales and marketing teams often need to find and extract email addresses and other contact information from large text documents.

This can be a cumbersome task if you try to do it manually! This is exactly the kind of situation where regex is really useful.

In [None]:
import re

text = "I'm Adesh, new@gmail.com, I'm Lisa, lisa@gmail.com"

re.findall(r"[\w.-]+@[\w.-]+", text)

['new@gmail.com', 'lisa@gmail.com']

Now you can see all the emails are extracted and can be stored in a list.

# Classes

## A simple Car class

Below is a very simple example of a class as a real world object. We will use it as an opportunity to discuss the definition of a class and the creation of objects.

The following link shows the code in Python tutor:

https://tinyurl.com/ycfq6oev

* To create a class, use the keyword ```class```. We can then use the class, here named ```Car```, to create objects.
* Classes usually define a method called ```__init__()```, which is executed automatically whenever a new object is created from the class. Use ```__init__()``` to assign values to object properties, or to perform other operations that are necessary when the object is being created.
* Methods are functions that belong to an object.
* The ```self``` parameter is a reference to the current instance of the class, and is used to access variables that belong to that instance.

In [None]:
# A car class
class Car:

    # Construct a new Car object with 0 miles
    def __init__(self):
        self.mileage = 0            # mileage driven
        print("New car constructed")

    # Drive a car method
    def drive(self, miles):
        # Drive the car for 'miles' miles
        self.mileage = self.mileage + miles

# Create a new Car object and put it in the variable car1
car1 = Car()

# Create a new Car object and put it in the variable car2
car2 = Car()

# Classes are just data types!
# What data type is car1?
print("Type of car1:", type(car1))

# Are car1 and car2 the same? Are they equal?
print("car1 == car2:", car1 == car2)
print("car1.mileage == car2.mileage:", car1.mileage == car2.mileage)

# Drive car1 20 miles
car1.drive(20)
print("Car mileage:", car1.mileage)

# Drive car2 30 miles
car2.drive(30)
print("Car mileage:", car2.mileage)

# Now are car1 and car2 the same? Are they equal?
print("car1 == car2:", car1 == car2)
print("car1.mileage == car2.mileage:", car1.mileage == car2.mileage)


New car constructed
New car constructed
Type of car1: <class '__main__.Car'>
car1 == car2: False
car1.mileage == car2.mileage: True
Car mileage: 20
Car mileage: 30
car1 == car2: False
car1.mileage == car2.mileage: False
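
Why is car1 == car2 False even when both cars have the same mileage? By default, == on objects of a class we define compares whether the two variables refer to the very same object, not whether their properties match. The sketch below uses a hypothetical ```ComparableCar``` class (not part of the notebook's Car class) to show one way a class could define its own notion of equality with an ```__eq__``` method.

```python
class ComparableCar:
    def __init__(self):
        self.mileage = 0

    def drive(self, miles):
        self.mileage = self.mileage + miles

    # Hypothetical addition: two cars are "equal" when their mileage is equal
    def __eq__(self, other):
        if not isinstance(other, ComparableCar):
            return NotImplemented
        return self.mileage == other.mileage

car_a = ComparableCar()
car_b = ComparableCar()

same_before = car_a == car_b   # both have 0 miles
car_a.drive(20)
same_after = car_a == car_b    # 20 miles vs 0 miles
same_object = car_a is car_b   # two separate objects

print(same_before, same_after, same_object)  # True False False
```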


## Magic 8 Ball

The following class defines a Magic 8 Ball game. It is very similar to the class from your homework assignment. It contains a list that keeps track of all the possible answers in the game, a dictionary that counts how often each answer is given, and methods for accessing these data structures.

* Attributes: the variables that are within each ball object.
    * count: Number of plays.
    * answers: a list containing the possible answers.
    * answer_count: a dictionary mapping each answer to the number of times it has been given.


* Methods: the functions that are within each ```Magic8Ball``` object.
    * ```__init__(self)```: construct and initialize the object.
    * ```add_answer(self, ans)```: add an answer to list of answers.
    * ```num_ans(self)```: return the number of possible answers.
    * ```get_answer_list(self)```: return a nicely sorted list of all the answers.
    * ```get_count(self)```: return the number of games played.
    * ```play(self)```: play the game

In [None]:
import random

class Magic8Ball:
    # Constructor method
    # The constructor method creates a new magic 8 ball
    def __init__(self):
        # Number of plays
        self.count = 0
        # List of potential answers
        self.answers = []
        # Dictionary of Answer counts
        self.answer_count = {}
        # Add answers from the configuration file, one answer per line
        with open("magic_answers.txt") as infile:
            for line in infile:
                # Each line becomes a key in answer_count (value initialised to 0)
                # and an entry in the answers list
                self.add_answer(line.strip())



    #EXERCISE: Adding answers using the add_answer method
    # Add a new answer to the Magic-8-Ball
    def add_answer(self, ans):
        self.answer_count[ans] = 0
        self.answers.append(ans)




    # Report on the number of possible answers
    def num_ans(self):
        #Length of the list answers
        return len(self.answers)



    #EXERCISE: Counting Answers
    # Return a nicely sorted list of all the answers
    def get_answer_list(self):
        #Sorting the dictionary based on the values
        return sorted(self.answer_count.items(), key=lambda x: x[1], reverse=True)



    #EXERCISE: Counting Plays
    # Return the number of games played
    def get_count(self):
        return self.count


    # Play the game
    def play(self):
        print("Magic-8-Ball")
        print("Shake. Shake. Shake.")
        while True:
            key = input("Press enter to exit.")
            if key == "":
                break
            ans = random.choice(self.answers)
            self.answer_count[ans] += 1
            print(ans)
        self.count += 1



def main():
    # Create Magic 8 Ball
    ball =  Magic8Ball()      #ball is instance of the class Magic8Ball()

    # Print the number of answers
    print("Number of answers in game:",ball.num_ans())

    # Play the game
    ball.play()
    ball.play()
    ball.add_answer("New answer")
    print("Games played:", ball.get_count())
    print("Top answers:", ball.get_answer_list())

main()

Number of answers in game: 21
Magic-8-Ball
Shake. Shake. Shake.
Concentrate and ask again.
Outlook not so good.
Yes.
Magic-8-Ball
Shake. Shake. Shake.
Ask again later.
Games played: 2
Top answers: [('Yes.', 1), ('Ask again later.', 1), ('Concentrate and ask again.', 1), ('Outlook not so good.', 1), ('It is certain.', 0), ('It is decidedly so.', 0), ('Without a doubt.', 0), ('Yes – definitely.', 0), ('You may rely on it.', 0), ('As I see it, yes.', 0), ('Most likely.', 0), ('Outlook good.', 0), ('No.', 0), ('Signs point to yes.', 0), ('Reply hazy, try again.', 0), ('Better not tell you now.', 0), ('Cannot predict now.', 0), ("Don't count on it.", 0), ('My reply is no.', 0), ('My sources say no.', 0), ('Very doubtful.', 0), ('New answer', 0)]


## Exercise: Adding answers using the add_answer method

Additional answers can be added to a Magic 8 Ball object using the add_answer() method. Add the following code to the main() function to give the Magic 8 Ball additional answers that make it more positive. Then print the number of answers and the sorted answer list to make sure the change was made.

Could all the answers from the file magic_answers.txt be loaded into the ball at this point instead of in the ```__init__``` method? What are the pros and cons?


In [None]:
    # Make the game more positive
    for ans in ["Yeah!","Affirmative.","I think yes.","I think so.","Yes!","Yes!!"]:
        ball.add_answer(ans)
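
One possible way to think about the question above is sketched below, using a hypothetical stripped-down ```SimpleBall``` class and a hypothetical ```load_answers()``` helper (neither is part of the notebook's Magic8Ball). Here the ball starts empty and the answers are loaded afterwards. A possible advantage is that the class no longer depends on a file being present. A possible drawback is that every program using the ball must remember to load answers before playing.

```python
class SimpleBall:
    # Hypothetical stand-in for Magic8Ball: starts empty, reads no file
    def __init__(self):
        self.answers = []
        self.answer_count = {}

    def add_answer(self, ans):
        self.answer_count[ans] = 0
        self.answers.append(ans)

    def num_ans(self):
        return len(self.answers)


def load_answers(ball, lines):
    # lines can be an open file or any other sequence of strings
    for line in lines:
        line = line.strip()
        if line:                  # skip blank lines
            ball.add_answer(line)


ball = SimpleBall()
# A list of strings stands in for the lines of magic_answers.txt
load_answers(ball, ["Yes.\n", "No.\n", "\n", "Ask again later.\n"])
print(ball.num_ans())  # 3
```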

#### Solution

In [None]:
def add_answer(self, ans):
    self.answer_count[ans] = 0
    self.answers.append(ans)

## Exercise: Counting Plays

As an exercise, extend the program above to count the number of times the game is played. Add an attribute count that keeps track of the count and a method get_count() that returns the number of games played.

#### Solution

In [None]:
def get_count(self):
    return self.count

## Exercise: Counting Answers

As an exercise, add a dictionary that counts the number of times each answer is given. Extend the method get_answer_list() so that it returns a list of answers sorted by the number of times they have been used, most used first. Ties in the sort order should be broken alphabetically.


#### Solution

In [None]:
# Return a nicely sorted list of all the answers
def get_answer_list(self):
    return sorted(self.answer_count.items(), key=lambda x: x[1], reverse=True)
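
The solution above sorts by count only. Because sorted() is stable, answers with the same count stay in the order they were added to the dictionary rather than in alphabetical order. A minimal sketch of one way to also break ties alphabetically is to sort on a tuple. The ```answer_count``` dictionary here is made-up data standing in for ```self.answer_count```.

```python
# Hypothetical counts, standing in for self.answer_count
answer_count = {"Yes.": 1, "No.": 0, "Ask again later.": 1, "It is certain.": 0}

# Sort by count, largest first (hence the minus sign), then by answer text A to Z
ranked = sorted(answer_count.items(), key=lambda item: (-item[1], item[0]))

print(ranked)
# [('Ask again later.', 1), ('Yes.', 1), ('It is certain.', 0), ('No.', 0)]
```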
