# Case study:

UML: We use the Unified Modeling Language (UML) to help depict and summarize the software we're going to build.

Logical View:
* These are the classes that define our core data model, plus some uses of the generic list class.
* Here are the four central classes:
    1. The TrainingData class is a container with two lists of data samples: one used for training our model and one used for testing it. Both lists are composed of KnownSample instances.

    2. Each instance of the Sample class is the core piece of working data. In our example, these are measurements of sepal lengths and widths and petal lengths and widths.

    3. A KnownSample object is an extended Sample: it has one extra attribute, the assigned species. This information comes from skilled botanists who have classified some data we can use for training and testing.

    4. The Hyperparameter class has the k used to define how many of the nearest neighbors to consider. It also has a summary of testing with this value of k. The quality tells us how many of the test samples were correctly classified. We expect that small values of k (like 1 to 3) won't classify well, that middle values of k will do better, and that very large values of k won't do as well.

# Samples and their States:

The diagram in Figure 2.2 shows the Sample class and an extension, the KnownSample class. This doesn't seem to be a complete decomposition of the various kinds of samples.
- When we review the user stories and the process views, there seems to be a gap: specifically, the "make classification request" by a user requires an unknown sample.
- This has the same flower measurement attributes as a Sample, but doesn't have the assigned species attribute of a KnownSample. Furthermore, there's no state change that adds an attribute value.
- The unknown sample will never be formally classified by a botanist; it will be classified by our algorithm, but it's only an AI, not a botanist.

We can make a case for two distinct subclasses of Sample:
* UnknownSample: This class contains the initial four Sample attributes. A user provides these objects to get them classified.
* KnownSample: This class has the Sample attributes plus the classification result, a species name. We use these for training and testing the model.
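One possible way to express this two-subclass decomposition is sketched below. The constructor signatures and attribute names are assumptions made for illustration; the notes do not fix them at this point.

```python
class Sample:
    """The four measurements shared by every kind of sample."""

    def __init__(
        self,
        sepal_length: float,
        sepal_width: float,
        petal_length: float,
        petal_width: float,
    ) -> None:
        self.sepal_length = sepal_length
        self.sepal_width = sepal_width
        self.petal_length = petal_length
        self.petal_width = petal_width


class UnknownSample(Sample):
    """A sample provided by a user; it carries no species."""


class KnownSample(Sample):
    """A sample with a botanist-assigned species (assumed constructor order)."""

    def __init__(
        self,
        species: str,
        sepal_length: float,
        sepal_width: float,
        petal_length: float,
        petal_width: float,
    ) -> None:
        super().__init__(sepal_length, sepal_width, petal_length, petal_width)
        self.species = species


unknown = UnknownSample(5.1, 3.5, 1.4, 0.2)
known = KnownSample("Iris-setosa", 5.1, 3.5, 1.4, 0.2)
print(isinstance(known, Sample), hasattr(unknown, "species"))
```

Both subclasses share the measurement attributes through inheritance; only KnownSample adds the species.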

Generally, we consider class definitions as a way to encapsulate state and behavior. An unknown sample instance provided by a user starts out with no species. Then, after the classifier algorithm computes a species, the sample changes state to have a species assigned by the algorithm.

A question we must always ask about a class definition is this:
* Is there any change in behavior that goes with the change in state?
    - In this case, it doesn't seem like there's anything new or different that can happen. Perhaps this can be implemented as a single class with some optional attributes.
    - We have another possible state change concern. Currently, there's no class that owns the responsibility of partitioning Sample objects into training or testing subsets. This too is a kind of state change.

This leads to a second question:
* What class has responsibility for making this state change?
    - In this case, it seems like the TrainingData class should own the discrimination between testing and training data.
    - One way to look closely at our class design is to enumerate all of the various states of individual samples.
    - This technique helps uncover a need for attributes in the classes.
    - It also helps to identify the methods that make state changes to objects of a class.





Additionally, the TrainingData class will have a list of alternative Hyperparameter values. In general, these are tuning values that change the behavior of the model.
# Attribute Def:
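An attribute is a piece of information that determines the properties of a field or tag in a database, or of a string of characters in a display. In object-oriented Python, an attribute is a named value attached to an object, such as the species of a sample.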

# What is Hyperparameter tuning?
* A hyperparameter is a configuration variable that data scientists set before training a machine learning model.
* Hyperparameters are different from parameters, which are internal values derived automatically during the learning process.
* Hyperparameters are important because they can significantly impact the performance and generalization ability of a machine learning model.
* Finding the right combination of hyperparameters is essential to coaxing the best performance from both supervised and unsupervised learning models.
* Examples of hyperparameters include the number of nodes and layers in a neural network, the number of branches in a decision tree, the learning rate, the batch size, and the number of epochs.

In general, these are tuning values that change the behavior of the model. The idea is to test with different hyperparameters to locate the highest quality model.
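To make the parameter versus hyperparameter distinction concrete, here is a minimal sketch that is separate from the iris case study. It fits a single weight `w` by gradient descent on made-up data. The learning rate and epoch count are hyperparameters chosen up front, `w` is the learned parameter, and `tune()` is a hypothetical helper that tries several learning rates and keeps the one with the lowest error.

```python
from typing import List, Tuple

# Toy data following y = 2x, so the ideal parameter is w = 2.
XS: List[float] = [1.0, 2.0, 3.0, 4.0]
YS: List[float] = [2.0, 4.0, 6.0, 8.0]


def mean_squared_error(w: float) -> float:
    """Quality measure for a trained model: lower is better."""
    return sum((w * x - y) ** 2 for x, y in zip(XS, YS)) / len(XS)


def train(learning_rate: float, epochs: int) -> float:
    """Learn the parameter w by gradient descent.

    learning_rate and epochs are hyperparameters: we set them before training.
    w is a parameter: the training loop derives it from the data.
    """
    w = 0.0
    for _ in range(epochs):
        gradient = sum(2 * x * (w * x - y) for x, y in zip(XS, YS)) / len(XS)
        w -= learning_rate * gradient
    return w


def tune(candidates: List[float], epochs: int = 50) -> Tuple[float, float]:
    """Train once per candidate learning rate; return the best rate and its w."""
    results = [(mean_squared_error(train(lr, epochs)), lr) for lr in candidates]
    _, best_lr = min(results)
    return best_lr, train(best_lr, epochs)


best_lr, best_w = tune([0.001, 0.01, 0.05, 0.2])
print(best_lr, round(best_w, 3))
```

With this toy data the smallest rate learns too slowly in 50 epochs and the largest one overshoots and diverges, so the search settles on a middle value. The same "try several, keep the best" loop applies to k in the k-nearest neighbors classifier.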

# What is metadata?
Metadata is information that describes other data: data about data. It can be applied to many things, including:
* Books, which can include the author, title, publisher, edition, and more.
* Computer files, which can include the file size, file extension, and when the file was created.
* Images, which can include the time and geolocation of the photo.
* Web pages, which can include meta tags that describe the content, such as the title, author, and keywords.

We've allocated a little bit of metadata to the TrainingData class:
* the name of the data we're working with,
* the datetime of when we uploaded the data the first time,
* and the datetime of when we ran a test against the model.
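The three metadata items above might be held as plain attributes. This is a minimal sketch: the attribute names `uploaded` and `tested` and the two `mark_...()` helper methods are assumptions for illustration, and loading and testing logic is left out.

```python
import datetime
from typing import Optional


class TrainingData:
    """Sketch of the metadata only; loading and testing logic is omitted."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Both timestamps stay None until the corresponding event happens.
        self.uploaded: Optional[datetime.datetime] = None
        self.tested: Optional[datetime.datetime] = None

    def mark_uploaded(self) -> None:
        """Record when the data was first loaded (hypothetical helper)."""
        self.uploaded = datetime.datetime.now(tz=datetime.timezone.utc)

    def mark_tested(self) -> None:
        """Record when a test was last run (hypothetical helper)."""
        self.tested = datetime.datetime.now(tz=datetime.timezone.utc)


data = TrainingData("Iris")
data.mark_uploaded()
print(data.name, data.uploaded is not None, data.tested)
```

Using timezone-aware UTC timestamps is a design choice here, not something the notes require; it avoids ambiguity if the data is uploaded and tested on machines in different time zones.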





# Sample State Transition

Let's look at the life cycles of Sample objects. An object's life cycle starts with object creation, continues through state changes, and in some cases ends when there are no more references to it. We have three scenarios.

1. Initial load: We'll need a load() method to populate a TrainingData object from some source of raw data (e.g., reading a CSV file often produces a sequence of dictionaries).
    - We can imagine a load() method using a CSV reader to create Sample objects with a species value, making them KnownSample objects. The load() method splits the KnownSample objects into the training and testing lists, which is an important state change for a TrainingData object.

2. Hyperparameter testing: We'll need a test() method in the Hyperparameter class. The body of the test() method works with the test samples in the associated TrainingData object. For each sample, it applies the classifier and counts the matches between the botanist-assigned species and the best guess of our AI algorithm. This points out the need for a classify() method for a single sample that's used by the test() method for a batch of samples.
    - The test() method will update the state of the Hyperparameter object by setting the quality score.

3. User-initiated classification: A RESTful web application is often decomposed into separate view functions to handle requests. When handling a request to classify an unknown sample, the view function will have a Hyperparameter object used for classification; this will be chosen by the botanist to produce the best results.
    - The user input will be an UnknownSample instance. The view function applies the Hyperparameter.classify() method to create a response to the user with the species the iris has been classified as. Does the state change that happens when the AI classifies an UnknownSample really matter? Here are two views:

* Each UnknownSample can have a classified attribute. Setting this is a change in the state of the Sample. It's not clear that there's any behavior change associated with this state change.

* The classification result is not part of the Sample at all. It's a local variable in the view function. This state change in the function is used to respond to the user, but has no life within the Sample object.
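The first scenario, the initial load, can be sketched as follows. This is a rough illustration under stated assumptions: the CSV column names, the simplified KnownSample stand-in, and the "every fifth row goes to testing" split rule are all choices made here, not something the notes specify.

```python
import csv
import io
from typing import Dict, Iterable, List


class KnownSample:
    """Minimal stand-in: four measurements plus a botanist-assigned species."""

    def __init__(
        self,
        species: str,
        sepal_length: float,
        sepal_width: float,
        petal_length: float,
        petal_width: float,
    ) -> None:
        self.species = species
        self.sepal_length = sepal_length
        self.sepal_width = sepal_width
        self.petal_length = petal_length
        self.petal_width = petal_width


class TrainingData:
    def __init__(self, name: str) -> None:
        self.name = name
        self.training: List[KnownSample] = []
        self.testing: List[KnownSample] = []

    def load(self, raw_data_source: Iterable[Dict[str, str]]) -> None:
        """Create KnownSample objects and partition them into the two lists."""
        for n, row in enumerate(raw_data_source):
            sample = KnownSample(
                species=row["species"],
                sepal_length=float(row["sepal_length"]),
                sepal_width=float(row["sepal_width"]),
                petal_length=float(row["petal_length"]),
                petal_width=float(row["petal_width"]),
            )
            # Assumed split rule: every fifth row is held back for testing.
            if n % 5 == 0:
                self.testing.append(sample)
            else:
                self.training.append(sample)


# Illustrative rows only; the header names are assumptions about the file.
CSV_TEXT = """\
sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

data = TrainingData("toy-iris")
data.load(csv.DictReader(io.StringIO(CSV_TEXT)))
print(len(data.training), len(data.testing))
```

Because `csv.DictReader` yields one dictionary per row, load() only depends on "an iterable of dictionaries" and can be exercised with a plain list of dicts in a unit test.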

Some design decisions are based on non-functional and non-technical considerations. These might include the longevity of the application, future use cases, additional users who might be enticed, current schedules and budgets, pedagogical value, technical risk, the creation of intellectual property, and how cool the demo will look in a conference call.






In [2]:
# An implementation almost always needs to have an __init__() method.
# There's another special method that can really help: the __repr__() method
# is used to create a representation of the object.

"""
The representation is a string that generally has the syntax of a Python expression to rebuild the object. For simple numbers, it's the number.

For a simple string, it will include the quotes. For complex objects, it will have all the necessary
Python punctuation, including all the details of the class and the state of the object.

* We will often use an f-string with the class name and the attribute values.

Here's the start of a class, Sample, which seems to capture all the features of a single sample:

"""
from typing import Optional

class Sample:

    def __init__(
        self,
        sepal_length: float,
        sepal_width: float,
        petal_length: float,
        petal_width: float,
        species: Optional[str] = None,
    ) -> None:
        self.sepal_length = sepal_length
        self.sepal_width = sepal_width
        self.petal_length = petal_length
        self.petal_width = petal_width
        # None until a botanist has assigned a species.
        self.species = species
        # None until the classifier has assigned a classification.
        self.classification: Optional[str] = None

    def __repr__(self) -> str:
        # The displayed name depends on whether a species is present.
        if self.species is None:
            known_unknown = "UnknownSample"
        else:
            known_unknown = "KnownSample"

        # Show the classification only once it has been set.
        if self.classification is None:
            classification = ""
        else:
            classification = f", {self.classification}"

        return (
            f"{known_unknown}("
            f"sepal_length={self.sepal_length}, "
            f"sepal_width={self.sepal_width}, "
            f"petal_length={self.petal_length}, "
            f"petal_width={self.petal_width}, "
            f"species={self.species!r}"
            f"{classification}"
            f")"
        )

# __repr__ method
* The __repr__() method reflects the fairly complex internal state of this Sample object. The states implied by the presence or absence of a species and the presence or absence of a classification lead to small behavior changes.
* So far, any changes in object behavior are limited to the __repr__() method used to display the current state of the object.
* What's important is that state changes do lead to tiny behavior changes.
* We have two application-specific methods for the Sample class. These are shown in the next code snippet.



In [None]:
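# These two functions are methods of the Sample class: place them, indented,
# in the class body after __repr__().
def classify(self, classification: str) -> None:
    # State change: unclassified -> classified.
    self.classification = classification

def matches(self) -> bool:
    # True when the classifier agrees with the botanist-assigned species.
    return self.species == self.classification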



# Note on the above definitions

The classify() method defines the state change from unclassified to classified. The matches() method compares the results of classification with a botanist-assigned species. This is used for testing.
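To see the two methods working as part of the class, here is a condensed, self-contained version of Sample with classify() and matches() placed in the class body. The __repr__() method is left out to keep the sketch short, and the measurement values are only illustrative.

```python
from typing import Optional


class Sample:
    """Condensed Sample with both methods in the class body."""

    def __init__(
        self,
        sepal_length: float,
        sepal_width: float,
        petal_length: float,
        petal_width: float,
        species: Optional[str] = None,
    ) -> None:
        self.sepal_length = sepal_length
        self.sepal_width = sepal_width
        self.petal_length = petal_length
        self.petal_width = petal_width
        self.species = species
        self.classification: Optional[str] = None

    def classify(self, classification: str) -> None:
        self.classification = classification

    def matches(self) -> bool:
        return self.species == self.classification


known = Sample(5.1, 3.5, 1.4, 0.2, species="Iris-setosa")
before = known.matches()  # species is set, classification is still None
known.classify("Iris-setosa")
after = known.matches()   # both now hold the same species name

# Caveat: with no species and no classification, None == None is True.
unknown = Sample(6.3, 3.3, 6.0, 2.5)
vacuous = unknown.matches()

print(before, after, vacuous)
```

The last case suggests that matches() is only meaningful for samples that have a species; a stricter version could check for None first, but that is a design choice the notes leave open.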


# Class Responsibilities

* Which class is responsible for actually performing a test?
* Does the TrainingData class invoke the classifier on each KnownSample in a testing set?
* Or perhaps, does it provide the testing set to the Hyperparameter class, delegating the testing to the Hyperparameter class?

- Since the Hyperparameter class has responsibility for the k value, and the algorithm for locating the k nearest neighbors, it seems sensible for the Hyperparameter class to run the test using its own k value and a list of KnownSample instances provided to it.

- It also seems clear the TrainingData class is an acceptable place to record the various Hyperparameter trials. This means the TrainingData class can identify which of the Hyperparameter instances has a value of k that classifies irises with the highest accuracy.

- There are multiple, related state changes here. In this case, both the Hyperparameter and TrainingData classes will do part of the work.
- The system as a whole will change state as individual elements change state. This is sometimes described as emergent behavior. Rather than writing a monster class that does many things, we've written smaller classes that collaborate to achieve the expected goals.

- This test() method of TrainingData is something that we didn't show in the UML image. We included test() in the Hyperparameter class, but at the time, it didn't seem necessary to add it to the TrainingData class.

Here is the start of the class definition.





In [None]:
import weakref

class Hyperparameter:
    """
    A hyperparameter value and the overall quality of the classification.
    """

    def __init__(self, k: int, training: "TrainingData") -> None:
        self.k = k
        # Weak reference: a Hyperparameter does not keep its TrainingData alive.
        self.data: weakref.ReferenceType["TrainingData"] = weakref.ref(training)
        # Annotation only; the value is assigned when a test is run.
        self.quality: float

# Weakref in Python
In Python, the weakref module allows you to create weak references to objects. A weak reference to an object does not prevent the object from being garbage collected.

Why use weak references?
* Avoiding memory leaks: Weak references are useful when you want to hold a reference to an object without preventing that object from being garbage collected once it is no longer needed. This can be useful in caching scenarios, for example.
* Handling circular references: Python's garbage collector can handle cycles of strong references, but it can be more efficient to use weak references to break those cycles.

How to use weak references:


import weakref

class MyClass:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        print(f"Object {self.name} is being deleted.")

obj = MyClass("MyObject")
weak_obj = weakref.ref(obj)

# Access the object through the weak reference
if weak_obj():
    print(weak_obj().name)

# Delete the object and see if the weak reference is still valid
del obj
if weak_obj():
    print(weak_obj().name)  # Not reached in CPython: the referent is already gone
else:
    print("Object has been deleted.")
Important points:
* weakref.ref: This function creates a weak reference to an object.
* weakref.proxy: This function creates a proxy object that behaves like the original object, but if the original object is garbage collected, the proxy will raise a ReferenceError.
* Callback function: You can pass a callback function to weakref.ref that will be called when the object is about to be finalized.
* weakref.WeakKeyDictionary and weakref.WeakValueDictionary: These dictionaries hold weak references to their keys or values, respectively. This prevents the dictionaries from keeping objects alive if they are no longer referenced elsewhere.
* weakref.WeakSet: This set holds weak references to its elements.
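The points above can be tried out together in one short sketch. The Resource class and the names used here are hypothetical. The call to `gc.collect()` is a precaution: in CPython the object is normally reclaimed as soon as its last strong reference is deleted.

```python
import gc
import weakref
from typing import List


class Resource:
    def __init__(self, name: str) -> None:
        self.name = name


events: List[str] = []
resource = Resource("iris-data")

# The callback receives the dead weak reference object as its argument.
ref = weakref.ref(resource, lambda r: events.append("finalized"))
proxy = weakref.proxy(resource)
cache = weakref.WeakValueDictionary({"iris": resource})

# While a strong reference exists, all three views reach the object.
print(ref() is resource, proxy.name, len(cache))

# Drop the only strong reference.
del resource
gc.collect()

print(ref() is None, len(cache), events)
try:
    proxy.name
except ReferenceError:
    print("The proxy's target is gone")
```

Once the last strong reference is gone, the plain weak reference returns None, the dictionary entry disappears on its own, and the proxy raises ReferenceError on attribute access.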