# Artificial Intelligence - Laboratory 02: Python Introduction part II


##### Review

The following formula computes a _Z score_ and measures how far a single raw data value is from the population mean.

\begin{equation*}
z = \frac{X - \mu }{\sigma }
\end{equation*}

where:
* **_X_** is a single raw data value
* `mu` is the population mean
* `sigma` is the population standard deviation

To find the standard deviation, the equation below comes in handy:

\begin{equation*}
\sigma = \sqrt{\frac{\sum \left | X - \mu \right |^{2}}{N}}
\end{equation*}

where **_N_** is the number of data points in the population.

**a.** Using `sum()` and `list comprehension`, compute the mean and the standard deviation for the population defined below:

In [2]:
import math
data =  [4.5, 5, 5.5, 6, 6.25, 7, 15.25, 18, 18.45, 21, 21.45, 23]
print(data)


[4.5, 5, 5.5, 6, 6.25, 7, 15.25, 18, 18.45, 21, 21.45, 23]


In [3]:
# Your implementation here:
mean = sum(data)/len(data)
std = math.sqrt(sum([(x-mean)**2 for x in data])/len(data))

print(mean)
print(std)


12.616666666666667
7.167441818544622
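
As a cross-check (not part of the exercise, which asks for `sum()` and a list comprehension), the standard library's `statistics` module provides `fmean` and `pstdev`. Assuming the population formula above, which divides by N, the results should agree with the hand-written version up to floating-point rounding. A minimal sketch:

```python
import math
import statistics

data = [4.5, 5, 5.5, 6, 6.25, 7, 15.25, 18, 18.45, 21, 21.45, 23]

# Hand-written population mean and standard deviation (divide by N).
mean = sum(data) / len(data)
std = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))

# Library equivalents: pstdev is the *population* standard deviation.
lib_mean = statistics.fmean(data)
lib_std = statistics.pstdev(data)

# Compare with a tolerance instead of ==, since these are floats.
print(math.isclose(mean, lib_mean), math.isclose(std, lib_std))
```

Note that `statistics.stdev` (without the `p`) divides by N - 1 and would therefore give a slightly larger value than the formula used here.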


**b.** Define the `z_score()` function and implement the mathematical expression. The obtained values should be stored in a _z score_ values list and rounded to 3 decimals.

In [4]:
# Your implementation here:
def z_score(x, mu, sigma):
    if sigma == 0:
        raise ValueError("Standard deviation is zero; z-scores undefined.")
    return (x - mu) / sigma

z_scores = [round(z_score(x, mean, std), 3) for x in data]
z_scores


[-1.132,
 -1.063,
 -0.993,
 -0.923,
 -0.888,
 -0.784,
 0.367,
 0.751,
 0.814,
 1.17,
 1.232,
 1.449]
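
A quick sanity check for z-scores, sketched here under the assumption that the population mean and population standard deviation are used: the standardized values themselves should have a mean of about 0 and a standard deviation of about 1. The helper names `z_mean` and `z_std` are introduced only for this illustration.

```python
import math

data = [4.5, 5, 5.5, 6, 6.25, 7, 15.25, 18, 18.45, 21, 21.45, 23]

mean = sum(data) / len(data)
std = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))

# Unrounded z-scores, so rounding error does not distort the check.
z = [(x - mean) / std for x in data]

z_mean = sum(z) / len(z)
z_std = math.sqrt(sum((v - z_mean) ** 2 for v in z) / len(z))

# Both comparisons use a tolerance because of floating-point arithmetic.
print(abs(z_mean) < 1e-9, math.isclose(z_std, 1.0))
```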

**c.** Add the corresponding elongation of each raw data value into a dictionary.

In [5]:
# Your implementation here:
elongation = {x: round(abs(z), 3) for x, z in zip(data, z_scores)}
elongation


{4.5: 1.132,
 5: 1.063,
 5.5: 0.993,
 6: 0.923,
 6.25: 0.888,
 7: 0.784,
 15.25: 0.367,
 18: 0.751,
 18.45: 0.814,
 21: 1.17,
 21.45: 1.232,
 23: 1.449}

## Classes

The object-oriented programming paradigm in Python helps with structuring programs into `individual objects`. But how?

* An Object **O** from a class **C** has a set of properties **_p_** and actions **_a_**.

* The functions of a class are called `methods`. They define the actions of an object and operate on its data.

* The objects of a class are known as `instances`; each instance holds its own data.

```python

class EmptyClass:
    """
    This is a class without variables and methods
    """
    pass # The keyword pass is a placeholder


class MyClass:
    # A class variable
    name = 'My Class'
    
    def my_method(self, my_var):
        # An instance variable
        self.my_instance = my_var
```
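
To make the difference between a class variable and an instance variable concrete, the sketch below repeats `MyClass` and creates two instances. The names `first` and `second` are chosen only for this illustration.

```python
class MyClass:
    # A class variable: shared by every instance
    name = 'My Class'

    def my_method(self, my_var):
        # An instance variable: stored on this particular object only
        self.my_instance = my_var


first = MyClass()
second = MyClass()
first.my_method(1)
second.my_method(2)

print(first.name, second.name)                # My Class My Class
print(first.my_instance, second.my_instance)  # 1 2
```

The instance variable exists only after `my_method` has been called on an object, which is one reason attributes are usually created in `__init__`.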

In [6]:
# Implement Task 0 b and c here:

class ScientificConference:
    """
    To define the properties of a class, 
    we use a special method called __init__.
    
    The special variable called "self"
    helps with associating the attributes
    with the new object: similar to `this`
    keyword from other programming languages
    and required to address variables from
    classes. 
    """
    def __init__(self, name, year, papers=None):
        """
        Establish the attributes of the
        class and assign values to the 
        corresponding parameters.
        """ 
        self.name = name
        self.year = year
        # b. Add new attribute `papers`
        if (papers is None):
            self.papers = {}
        else:
            # handle duplicate entries by removing them (per researcher)
            # preserve order using dict.fromkeys
            self.papers = {
                author: list(dict.fromkeys(titles))
                for author, titles in papers.items()
            }


    
    def add_manuscript(self, title, researcher):
        # c. Add (researcher -> [titles]) and avoid duplicates
        if researcher in self.papers:
            if title not in self.papers[researcher]:
                self.papers[researcher].append(title)
        else:
            self.papers[researcher] = [title]

    def __str__(self):
        """
        To return the String representation of
        an object, we use the __str__ method. 
        """
        result = self.name + ' ' + str(self.year) + ': \n'
        for author, papers in self.papers.items():
            result += f'{author}: {", ".join([str(paper) for paper in papers])} \n'
        return result


### Task 0

**a.** Define two new `instances` of the `class ScientificConference` and return their representations.

Your output should look like:

`Proposals for ICML and NeurIPS conferences will be accepted until the end of November 2021.`

_Hint:_ `instance.attribute` helps you extract a certain property.

In [7]:
sc  = ScientificConference("UPB AI Conf.", 2000, {"nlp": ["word2vec"], "gen ai": ["gpt"]})
sc2 = ScientificConference("UBB AI conf.", 2000, {"gpt": ["transformers"], "face recognition": ["eigenfaces"]})

print(sc)
print(sc2)


UPB AI Conf. 2000: 
nlp: word2vec 
gen ai: gpt 

UBB AI conf. 2000: 
gpt: transformers 
face recognition: eigenfaces 



**b.** Create a new attribute for the `class ScientificConference`, which is a dictionary passed as a parameter to the instances of the class and holds all of the papers of the conference.

_Note:_ You should check if `papers` is `None` in `__init__` and set it to `{}` instead.

_Please handle duplicate entries by removing them!_
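
One possible way to remove duplicates while keeping the original order is `dict.fromkeys`, since dictionary keys are unique and dictionaries preserve insertion order (Python 3.7+). The `titles` list below is made-up sample data for this sketch.

```python
titles = ["word2vec", "gpt", "word2vec", "eigenfaces", "gpt"]

# Keys are unique, so repeated titles collapse to their first occurrence.
unique_titles = list(dict.fromkeys(titles))

print(unique_titles)  # ['word2vec', 'gpt', 'eigenfaces']
```

Converting to a `set` would also drop duplicates, but it does not guarantee that the order of the titles is kept.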

**c.** Define the `add_manuscript` method which generates new entries in the dictionary described before. Please consider using the _researcher_ as a `key` and the _title_ as `values`.

In [8]:

sc.add_manuscript("markov chains", "mario")
sc.add_manuscript("markov chains 2", "mario")
sc.add_manuscript("markov chains", "mario")  
print(sc)


UPB AI Conf. 2000: 
nlp: word2vec 
gen ai: gpt 
mario: markov chains, markov chains 2 



### Task 1

**a.** Define the class `Person` which stores the `title`, `name` and `surname` of a person.

The _tuple_ `allowed_titles` is a class variable which helps to verify if the title of a person is "Mr", "Mrs", "Ms", "Senior Researcher", "Professor of CS" or "Computer Scientist".

An error is raised if the title is not valid.

Use `__str__` defined below:

```python
    def __str__(self):
        return self.title + ' ' + self.surname + ' ' + self.name
```

In [9]:
class Person:
    # class variable with the only valid titles
    allowed_titles = ("Mr", "Mrs", "Ms", "Senior Researcher", "Professor of CS", "Computer Scientist")

    def __init__(self, title, name, surname):
        title = title.strip()
        if title not in Person.allowed_titles:
            raise ValueError("The title isn't right")
        self.title = title
        self.name = name.strip()
        self.surname = surname.strip()

    def __str__(self):
        return self.title + ' ' + self.surname + ' ' + self.name

**b.** Create two instances of the class Person and verify if the following entries are valid:

* _Mr Ian Goodfellow_,
* _SeniorResearcher Tomas Mikolov._

In [10]:
# valid
p1 = Person("Mr", "Ian", "Goodfellow")
print(p1)  # Mr Goodfellow Ian

# invalid (no space in "SeniorResearcher")
try:
    p2 = Person("SeniorResearcher", "Tomas", "Mikolov")
    print(p2)
except ValueError as e:
    print("Invalid entry:", e)

# valid version
p2_ok = Person("Senior Researcher", "Tomas", "Mikolov")
print(p2_ok)  # Senior Researcher Mikolov Tomas

Mr Goodfellow Ian
Invalid entry: The title isn't right
Senior Researcher Mikolov Tomas


### Task 2

In `ScientificConference` we have treated each paper as a string, but a paper needs a more detailed structure.

Introduce a new class, `Paper`, which has the following attributes:

* `authors`, 
* `title`, 
* `a_id`,
* `year`, 
* `status` (published or in development), 
* `peer_rating` (Excellent, Good, Fair, Poor, Barely Acceptable, Unacceptable).

In [11]:
class Paper:
    def __init__(self, authors, title, a_id, status, year, peer_rating):
        if not title or not title.strip():
            raise ValueError("The title isn't right")
        self.authors = authors
        self.title = title
        self.a_id = a_id
        self.status = status
        self.year = year
        self.peer_rating = peer_rating

    def __str__(self):
        return  f'{self.title}, {", ".join([author for author in self.authors])} et al. ({self.year}), a_id: '\
                f'{self.a_id}, status: {self.status}, rating: {self.peer_rating}'

## Inheritance

In object-oriented programming, inheritance lets a class take over the methods and properties of another class.
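
A minimal sketch of the idea, using hypothetical `Vehicle` and `Truck` classes that are not part of the lab: the child class calls `super().__init__` to reuse the parent's initialization and then adds its own attribute.

```python
class Vehicle:
    def __init__(self, brand, wheels):
        self.brand = brand
        self.wheels = wheels

    def describe(self):
        return f'{self.brand} with {self.wheels} wheels'


class Truck(Vehicle):
    def __init__(self, brand, wheels, payload_tons=0):
        # Let the parent class set brand and wheels
        super().__init__(brand, wheels)
        # Attribute that only the child class has
        self.payload_tons = payload_tons


truck = Truck('Volvo', 6, payload_tons=12)
print(truck.describe())            # Volvo with 6 wheels
print(isinstance(truck, Vehicle))  # True
```

`describe` is defined only on `Vehicle`, yet it is available on `Truck` instances because it is inherited.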

### Task 3

Create a class named `Researcher`, which inherits the properties and methods of the `Person` class. In addition, this class has an extra parameter, `papers`, which is `None` by default.

_Note:_ You should check if `papers` is `None` in `__init__` and set it to `[]` instead.

```python
class Researcher(Person):
    def __init__(self, title, name, surname, papers=None):  # arguments to add
        super().__init__(title, name, surname)
```

In [12]:


class Researcher(Person):
    def __init__(self, title, name, surname, papers=None):
        if papers is None:
            papers = []
        super().__init__(title or "Mr", name, surname)
        self.papers = list(dict.fromkeys(papers))  # dedupe, preserve order
        self.co_authored = False

    def verify_co_authorship(self, other):
        """Set and return whether self ever co-authored a paper with `other`."""
        shared_ids = {p.a_id for p in self.papers} & {p.a_id for p in other.papers}
        self.co_authored = len(shared_ids) > 0
        return self.co_authored

    def get_collab(self, other):
        """Return list of papers co-authored by self and other."""
        other_ids = {p.a_id for p in other.papers}
        return [p for p in self.papers if p.a_id in other_ids]



### Task 4

Consider the following scientists:

1. Paper _Deep Learning_ published by Yann LeCun, Yoshua Bengio, Geoffrey Hinton, in _Nature 521_, id = https://doi.org/10.1038/nature14539, peer_rating = Excellent.

2. Paper _On the difficulty of training recurrent neural networks_ by Razvan Pascanu, Tomas Mikolov, Professor of computer science Yoshua Bengio, in ICML 2013, id = https://arxiv.org/abs/1211.5063, peer_rating = Excellent.

3. Paper _Generative Adversarial Nets_ by Ian Goodfellow and Yoshua Bengio, NeurIPS 2015, id = http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf, peer_rating = Excellent.

4. Paper _Handwritten Digit Recognition with a Back-Propagation Network_ by Computer Scientist Yann LeCun, NeurIPS 1989, id = https://papers.nips.cc/paper/293-handwritten-digit-recognition-with-a-back-propagation-network, peer_rating = Excellent.

5. Paper _Gated Softmax Classification_ by Geoffrey Hinton, NeurIPS 2010, id = http://papers.neurips.cc/paper/3895-gated-softmax-classification, peer_rating = Good.

_Note:_ Let us consider "Mr" as the default title for researchers without a specific title. Also, for the id of a paper, please use only the integers from the provided links.

**a.** Define the next 5 scientists and use them in your `paper` objects.

**b.** Create the `verify_co_authorship` function inside the `class Researcher` which checks if a certain researcher ever co-authored a paper.
_Hint:_ Use `self.co_authored = False` inside the `__init__` function.

**c.** Implement the `get_collab` function inside the `class Researcher` to discover the papers written by two researchers.

For instance, if Yoshua Bengio is researcher2 and Ian Goodfellow is researcher3, then:

`print_papers(researcher2.get_collab(researcher3))` should output:

_Generative Adversarial Nets, Mr Ian Goodfellow et al. (2015), a_id: 5423, status: published, rating: Excellent_

_Note:_ This function helps you to print the papers from a given list.

```python
def print_papers(paper_list):
    for paper in paper_list:
        print(paper)
```

**d.** What are the papers written by Yoshua Bengio?

Expected output:

`Deep Learning, Computer Scientist Yann LeCun et al. (2015), a_id: 14539, status: published, rating: Excellent`

`Generative Adversarial Nets, Mr Ian Goodfellow et al. (2015), a_id: 5423, status: published, rating: Excellent`

`On the difficulty of training recurrent neural networks, Mr Razvan Pascanu et al. (2013), a_id: 5063, status: published, rating: Excellent`

**e.** Did he ever co-author a paper?

**f.** Which papers are published by Yann LeCun?

Expected output:

`Deep Learning, Computer Scientist Yann LeCun et al. (2015), a_id: 14539, status: published, rating: Excellent`

`Handwritten Digit Recognition with a Back-Propagation Network, Computer Scientist Yann LeCun et al. (1989), a_id: 293, status: published, rating: Good`

In [13]:
researchers = [
    Researcher("Computer Scientist", "Yann", "LeCun"),
    Researcher("Professor of CS", "Yoshua", "Bengio"),
    Researcher("Mr", "Geoffrey", "Hinton"),
    Researcher("Mr", "Ian", "Goodfellow"),
    Researcher("Mr", "Tomas", "Mikolov"),
]

papers = [
    Paper(authors=["Yann LeCun", "Yoshua Bengio", "Geoffrey Hinton"], title="Deep Learning", a_id=14539, status="published", year=2015, peer_rating="Excellent"),
    Paper(authors=["Razvan Pascanu", "Tomas Mikolov", "Yoshua Bengio"], title="On the difficulty of training recurrent neural networks", a_id=5063, status="published", year=2013, peer_rating="Excellent"),
    Paper(authors=["Ian Goodfellow", "Yoshua Bengio"], title="Generative Adversarial Nets", a_id=5423, status="published", year=2015, peer_rating="Excellent"),
    Paper(authors=["Yann LeCun"], title="Handwritten Digit Recognition with a Back-Propagation Network", a_id=293, status="published", year=1989, peer_rating="Good"),
    Paper(authors=["Geoffrey Hinton"], title="Gated Softmax Classification", a_id=3895, status="published", year=2010, peer_rating="Good"),
]


researchers[0].papers.extend([papers[0], papers[3]])  
researchers[1].papers.extend([papers[0], papers[1], papers[2]])  
researchers[2].papers.append(papers[0]) 
researchers[3].papers.append(papers[2])  
researchers[4].papers.append(papers[1])  


yoshua_papers = [str(paper) for paper in researchers[1].papers]

co_author_status = researchers[1].verify_co_authorship(researchers[3])

yann_lecun_papers = [str(paper) for paper in researchers[0].papers]

yoshua_papers, co_author_status, yann_lecun_papers

(['Deep Learning, Yann LeCun, Yoshua Bengio, Geoffrey Hinton et al. (2015), a_id: 14539, status: published, rating: Excellent',
  'On the difficulty of training recurrent neural networks, Razvan Pascanu, Tomas Mikolov, Yoshua Bengio et al. (2013), a_id: 5063, status: published, rating: Excellent',
  'Generative Adversarial Nets, Ian Goodfellow, Yoshua Bengio et al. (2015), a_id: 5423, status: published, rating: Excellent'],
 True,
 ['Deep Learning, Yann LeCun, Yoshua Bengio, Geoffrey Hinton et al. (2015), a_id: 14539, status: published, rating: Excellent',
  'Handwritten Digit Recognition with a Back-Propagation Network, Yann LeCun et al. (1989), a_id: 293, status: published, rating: Good'])

### Task 5 

Consider an updated version of the `ScientificConference` class, which should have a modified version of the function `add_manuscript`.

Use the `status` and the `peer_rating` variables as a **threshold** to add papers to your `papers` dictionary. The conference will only accept `Excellent` papers. In this case, the dictionary has the year of the paper as `key`, and the `values` are stored as tuples of `(researcher, manuscript)`. For papers which don't satisfy this condition, the message _"Please review your submission."_ is displayed.

For papers submitted in 2015, when printing the conference, the `__str__` method should output:

```
NeurIPS 2020: 
2015: 
Mr Ian Goodfellow: Generative Adversarial Nets, Mr Ian Goodfellow et al. (2015), a_id: 5423, status: published, rating: Excellent 
Computer Scientist Yann LeCun: Deep Learning, Computer Scientist Yann LeCun et al. (2015), a_id: 14539, status: published, rating: Excellent
```
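
The year-keyed structure can be sketched with `dict.setdefault`, which creates the list for a year the first time that year appears. This is only an illustration of the data layout: plain title strings stand in for `Paper` objects, and the `submissions` list and `papers_by_year` name are assumptions made for this example.

```python
# (year, researcher, title) triples; titles stand in for Paper objects here
submissions = [
    (2015, "Mr Ian Goodfellow", "Generative Adversarial Nets"),
    (2013, "Mr Razvan Pascanu", "On the difficulty of training recurrent neural networks"),
    (2015, "Computer Scientist Yann LeCun", "Deep Learning"),
]

papers_by_year = {}
for year, researcher, title in submissions:
    # setdefault returns the existing list for this year or inserts a new one
    papers_by_year.setdefault(year, []).append((researcher, title))

for year, entries in papers_by_year.items():
    print(year, entries)
```

Each key maps to a list of `(researcher, manuscript)` tuples, so several accepted papers from the same year can be stored side by side.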

In [None]:
class ScientificConferenceUpdate:
    """
    To define the properties of a class, 
    we use a special method called __init__.
    
    The special variable called "self"
    helps with associating the attributes
    with the new object: similar to `this`
    keyword from other programming languages
    and required to address variables from
    classes. 
    """
    def __init__(self, name, year, papers=None):
        """
        Establish the attributes of the
        class and assign values to the 
        corresponding parameters.
        """ 
        self.name = name
        self.year = year
        # Add new attribute `papers`
        # dict: year -> list of (researcher, manuscript)
        if papers is None:
            self.papers = {}
        else:
            # ensure structure and remove duplicates per (researcher,title)
            cleaned = {}
            for y, items in papers.items():
                seen = set()
                lst = []
                for (author, paper) in items:
                    key = (author, paper.title.strip().lower())
                    if key not in seen:
                        seen.add(key)
                        lst.append((author, paper))
                cleaned[y] = lst
            self.papers = cleaned
    
    def add_manuscript(self, manuscript: Paper, researcher):
      
        # Accept only published + Excellent papers
        status = manuscript.status
        rating = str(manuscript.peer_rating)
        # accept "Excellent" regardless of case and tolerate the misspelling "Excelent"
        is_excellent = rating.lower().startswith("excelent") or rating.lower().startswith("excellent")
        if status == "published" and is_excellent:
            y = manuscript.year
            if y not in self.papers:
                self.papers[y] = []
            # avoid duplicates: same researcher & same title within that year
            title_norm = manuscript.title.strip().lower()
            if not any(a == researcher and p.title.strip().lower() == title_norm for (a, p) in self.papers[y]):
                self.papers[y].append((researcher, manuscript))
        else:
            print("Please review your submission.")
        
    def __str__(self):
        """
        To return the String representation of
        an object, we use the __str__ method. 
        """
        result = self.name + ' ' + str(self.year) + ': \n'
        for year, papers in self.papers.items():
            result += f'{year}: \n'
            for (author, paper) in papers: 
                result += f'{author}: {paper} \n'
        return result


conf = ScientificConferenceUpdate("NeurIPS", 2020)  # no manuscripts (empty section)




conf.add_manuscript(papers[2], "Mr Ian Goodfellow")  # Generative Adversarial Nets (2015, Excellent) -> accept
conf.add_manuscript(papers[0], "Computer Scientist Yann LeCun")  # Deep Learning (2015, Excellent) -> accept
conf.add_manuscript(papers[4], "Mr Geoffrey Hinton")  # Gated Softmax (2010, Good) -> reject
conf.add_manuscript(papers[1], "Mr Razvan Pascanu")  # On the difficulty... (2013, Excellent) -> accept (adds 2013 section)
conf.add_manuscript(papers[2], "Mr Ian Goodfellow")  # duplicate -> ignored

print(conf)


Please review your submission.
NeurIPS 2020: 
2015: 
Mr Ian Goodfellow: Generative Adversarial Nets, Ian Goodfellow, Yoshua Bengio et al. (2015), a_id: 5423, status: published, rating: Excellent 
Computer Scientist Yann LeCun: Deep Learning, Yann LeCun, Yoshua Bengio, Geoffrey Hinton et al. (2015), a_id: 14539, status: published, rating: Excellent 
2013: 
Mr Razvan Pascanu: On the difficulty of training recurrent neural networks, Razvan Pascanu, Tomas Mikolov, Yoshua Bengio et al. (2013), a_id: 5063, status: published, rating: Excellent 



