# <b>Questionnaire</b>

1. Provide an example of where the bear classification model might work poorly in production, due to structural or style differences in the training data.

> Deep learning algorithms are generally not good at recognizing images that are significantly different in structure or style to those used to train the model. For instance, if there were no black-and-white images in the training data, the model may do poorly on black-and-white images. Similarly, if the training data did not contain hand-drawn images, then the model will probably do poorly on hand-drawn images.
> * The bear is not the main subject of the picture, or it appears too small.
> * Something partially hides the bear.
> * The photo was taken at night.
> * The training dataset is heavily biased towards one type of feature.

2. Where do text models currently have a major deficiency?


> * Computers are very good at classifying both short and long documents based on categories such as spam or not spam, sentiment (e.g., is the review positive or negative), author, source website, and so forth. Deep learning is also good at generating text that is compelling to humans.
> * Deep learning is currently not good at generating correct responses!

3. What are possible negative societal implications of text generation models?

> Misinformation and hallucinations can cause harm when people believe them and spread them further.

4. In situations where a model might make mistakes, and those mistakes could be harmful, what is a good alternative to automating a process?

> Experts in the field evaluate the results and decide on the best next step. This is the usual approach in medical diagnosis: a radiologist can be alerted when a machine learning model spots something dangerous, and can prioritize those CT scans over the ones that look normal.
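>
> A minimal sketch of this kind of human-in-the-loop triage, assuming a hypothetical risk score per scan produced by a model. The model only orders the expert's work queue; every scan is still reviewed by a person. The scan IDs, scores, and the `prioritize_scans` helper are invented for illustration.

```python
def prioritize_scans(scans):
    """Return scan IDs ordered from highest to lowest model risk score.

    `scans` maps a scan ID to a hypothetical model risk score in [0, 1].
    Nothing is filtered out: the expert still reviews every scan.
    """
    return sorted(scans, key=lambda scan_id: scans[scan_id], reverse=True)


# Hypothetical scores, for illustration only.
scores = {"scan_a": 0.12, "scan_b": 0.91, "scan_c": 0.47}
queue = prioritize_scans(scores)
print(queue)  # ['scan_b', 'scan_c', 'scan_a']
```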

5. What kind of tabular data is deep learning particularly good at?


> Deep learning does greatly increase the variety of columns that you can include—for example, columns containing natural language (book titles, reviews, etc.), and high-cardinality categorical columns (i.e., something that contains a large number of discrete choices, such as zip code or product ID). 

6. What's a key downside of directly using a deep learning model for recommendation systems?

> It may recommend things you have already seen or would have chosen anyway, rather than something new and helpful to you.


7. What are the steps of the Drivetrain approach?

> The basic idea is to start with considering your objective, then think about what actions you can take to meet that objective and what data you have (or can acquire) that can help, and then build a model that you can use to determine the best actions to take to get the best results in terms of your objective.


> ![Alt text](blog/drivetrain-approach.png)

8. How do the steps of the Drivetrain Approach map to a recommendation system?

> * The objective of a recommendation engine is to drive additional sales by surprising and delighting the customer with recommendations of items they would not have purchased without the recommendation.
> * The lever is the ranking of the recommendations.
> * New data must be collected to generate recommendations that will cause new sales. This requires conducting many randomized experiments.
> * Build two models for purchase probabilities, conditional on seeing or not seeing a recommendation
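>
> The last step can be sketched as ranking items by the difference between the two predicted probabilities. This is only an illustration: the dictionaries below stand in for the outputs of the two models, and the item names, numbers, and the `rank_by_uplift` helper are hypothetical.

```python
def rank_by_uplift(p_with_rec, p_without_rec):
    """Rank items by the estimated increase in purchase probability.

    Both arguments map item -> probability. They stand in for the two
    models: purchase given the recommendation was shown, and purchase
    given it was not shown.
    """
    uplift = {item: p_with_rec[item] - p_without_rec[item] for item in p_with_rec}
    return sorted(uplift, key=uplift.get, reverse=True)


# Hypothetical probabilities, for illustration only.
p_seen = {"bestseller": 0.90, "niche_novel": 0.30, "cookbook": 0.15}
p_unseen = {"bestseller": 0.88, "niche_novel": 0.05, "cookbook": 0.10}

ranking = rank_by_uplift(p_seen, p_unseen)
print(ranking)  # ['niche_novel', 'cookbook', 'bestseller']
```

> The bestseller has the highest purchase probability but ranks last, because the customer would most likely have bought it without the recommendation.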

9. Create an image recognition model using data you curate, and deploy it on the web.

> https://huggingface.co/spaces/DagmarC/lesson2

10. What is DataLoaders?

> DataLoaders is a thin class that just stores whatever DataLoader objects you pass to it, and makes them available as train and valid.
> It is a high-level abstraction that handles the process of loading and managing datasets for training machine learning models.

In [None]:
class DataLoaders(GetAttr):
    def __init__(self, *loaders): self.loaders = loaders
    def __getitem__(self, i): return self.loaders[i]
    train,valid = add_props(lambda i,self: self[i])

> * Data Handling: Efficiently loads data from datasets and provides it to the model in batches.
> * Data Augmentation: Supports data augmentation techniques that can be applied to the training data on-the-fly.
> * Transforms: Allows for easy application of transformations to data, such as normalization, cropping, resizing, etc.
> * Batching: Handles the creation of batches from the dataset, which helps in optimizing the training process.
> * Shuffling: Supports shuffling of data to ensure that the model does not learn the order of the training data.
> * Parallel Loading: Uses multiple processes to load data in parallel, reducing the time required to fetch data during training.
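>
> To make the batching and shuffling ideas concrete, here is a toy sketch in plain Python. It is not fastai's implementation: `batches` and `SimpleLoaders` are hypothetical names, and the sketch leaves out transforms, augmentation, and parallel loading.

```python
import random


def batches(items, batch_size, shuffle=False, seed=None):
    """Yield lists of up to `batch_size` items, optionally shuffled first."""
    items = list(items)
    if shuffle:
        random.Random(seed).shuffle(items)
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


class SimpleLoaders:
    """Toy container that exposes its first two loaders as train and valid."""

    def __init__(self, *loaders):
        self.loaders = loaders

    def __getitem__(self, i):
        return self.loaders[i]

    @property
    def train(self):
        return self[0]

    @property
    def valid(self):
        return self[1]


# Training data is shuffled; validation data keeps its order.
train_batches = list(batches(range(10), batch_size=4, shuffle=True, seed=0))
valid_batches = list(batches(range(10, 14), batch_size=4))
dls_sketch = SimpleLoaders(train_batches, valid_batches)
print(len(dls_sketch.train), len(dls_sketch.valid))  # 3 1
```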

11. What four things do we need to tell fastai to create DataLoaders?

> 1. What kinds of data we are working with
> 2. How to get the list of items
> 3. How to label these items
> 4. How to create the validation set

In [None]:
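from fastai.vision.all import *

trees = DataBlock(
    blocks=(ImageBlock, CategoryBlock),               # 1. kinds of data: images in, categories out
    get_items=get_image_files,                        # 2. how to get the list of items
    splitter=RandomSplitter(valid_pct=0.2, seed=42),  # 4. how to create the validation set
    get_y=parent_label,                               # 3. how to label the items (parent folder name)
    item_tfms=Resize(128))                            # resize every image to 128x128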

12. What does the splitter parameter to DataBlock do?

> The splitter parameter defines how the items are divided into the training and validation sets. In the example above, RandomSplitter does this randomly.

13. How do we ensure a random split always gives the same validation set?


> To get the same training/validation split each time we run the notebook, we <b>fix the random seed</b>. Computers don't really know how to create random numbers at all; they simply create lists of numbers that look random. If you provide the same starting point for that list each time (called the seed), then you will get the exact same list each time.
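>
> The effect of a fixed seed can be shown with Python's standard `random` module. This is a simplified sketch, not fastai's RandomSplitter; the `random_split` helper and the file names are hypothetical.

```python
import random


def random_split(items, valid_pct=0.2, seed=None):
    """Shuffle item indices with a seeded generator and split them."""
    items = list(items)
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    n_valid = int(len(items) * valid_pct)
    valid = [items[i] for i in indices[:n_valid]]
    train = [items[i] for i in indices[n_valid:]]
    return train, valid


files = [f"img_{i}.jpg" for i in range(10)]

# Same seed, same shuffle, same validation set.
train_a, valid_a = random_split(files, valid_pct=0.2, seed=42)
train_b, valid_b = random_split(files, valid_pct=0.2, seed=42)
print(valid_a == valid_b)  # True
```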

14. What letters are often used to signify the independent and dependent variables?

>  x is independent. y is dependent.
>
> The independent variable is the thing we are using to make predictions from, and the dependent variable is our target.
> 
> e.g. Independent: images, Dependent: Categories

15. What’s the difference between crop, pad, and squish Resize() approaches? When might you choose one over the other?


> * crop is the default Resize() method, and it crops the images to fit a square shape of the size requested, using the full width or height. This can result in losing some important details. For instance, if we were trying to recognize the breed of dog or cat, we may end up cropping out a key part of the body or the face necessary to distinguish between similar breeds.
> * pad is an alternative Resize() method, which pads the matrix of the image’s pixels with zeros (which shows as black when viewing the images). If we pad the images then we have a whole lot of empty space, which is just wasted computation for our model, and results in a lower effective resolution for the part of the image we actually use.
> * squish is another alternative Resize() method, which can either squish or stretch the image. This can cause the image to take on an unrealistic shape, leading to a model that learns that things look different to how they actually are, which we would expect to result in lower accuracy.
> 
> * Another better method is RandomResizedCrop, in which we crop on a randomly selected region of the image. So every epoch, the model will see a different part of the image and will learn accordingly.
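>
> The trade-offs of the three methods can be seen from the geometry alone. The sketch below touches no pixels and is not fastai's Resize; `resize_plan` is a hypothetical helper that reports the image size after scaling and what is lost or wasted on the way to a square.

```python
def resize_plan(width, height, size, method):
    """Describe how a width x height image becomes a size x size square.

    Returns (scaled_width, scaled_height, note). The scaled size is the
    image size before cropping or padding is applied.
    """
    if method == "squish":
        # Each axis is scaled independently, so the aspect ratio changes.
        return size, size, "aspect ratio distorted"
    if method == "crop":
        # Scale so the shorter side fits, then cut off the overflow.
        scale = size / min(width, height)
        w, h = round(width * scale), round(height * scale)
        return w, h, f"{w * h - size * size} pixels cropped away"
    if method == "pad":
        # Scale so the longer side fits, then fill the rest with zeros.
        scale = size / max(width, height)
        w, h = round(width * scale), round(height * scale)
        return w, h, f"{size * size - w * h} pixels of padding"
    raise ValueError(f"unknown method: {method}")


for method in ("crop", "pad", "squish"):
    print(method, resize_plan(400, 200, 128, method))
```

> For a 400x200 image and a target size of 128, crop discards half of the scaled image, while pad fills half of the final square with zeros.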

16. What is data augmentation? Why is it needed?

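> Data augmentation creates random variations of the input data, so that they appear different but their meaning stays the same. It is needed because it helps the model learn to recognize the same object under different conditions without collecting more labeled data.
> 
> Examples for images: rotation, flipping, perspective warping, brightness and contrast changes.
> 
> In fastai, the aug_transforms function provides a standard set of these augmentations.
> 
> When all images are the same size, we can apply these transforms to an entire batch on the GPU, which saves a lot of time.
> 
> To tell fastai we want to apply these transforms to a batch, we use the batch_tfms parameter.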
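>
> As a toy illustration of the idea (not fastai's aug_transforms), the sketch below applies a random horizontal flip and a random brightness shift to a tiny grayscale image stored as a list of rows. The `augment` helper and the pixel values are hypothetical.

```python
import random


def augment(image, rng):
    """Return a randomly flipped and brightened copy of a grayscale image.

    `image` is a list of rows of pixel values in 0..255. The input is
    left unchanged.
    """
    out = [row[:] for row in image]
    if rng.random() < 0.5:
        out = [row[::-1] for row in out]  # horizontal flip
    shift = rng.randint(-20, 20)  # brightness change
    return [[min(255, max(0, p + shift)) for p in row] for row in out]


rng = random.Random(0)
image = [[10, 200], [30, 120]]

# Each call yields a different-looking version of the same picture.
variants = [augment(image, rng) for _ in range(3)]
for variant in variants:
    print(variant)
```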

17. What is the difference between item_tfms and batch_tfms?

In [None]:
bears = bears.new(item_tfms=Resize(128), batch_tfms=aug_transforms(mult=2))
dls = bears.dataloaders(path)
dls.train.show_batch(max_n=8, nrows=2, unique=True)

> * <b>item_tfms</b> are transformations applied to a single data sample x on the CPU. Resize() is a common transform because the images in a mini-batch fed to a CNN must all have the same dimensions. Assuming the images are RGB with 3 channels, Resize() as item_tfms makes sure they all have the same width and height.

> * <b>batch_tfms</b> are applied to batched data samples (aka individual samples that have been collated into a mini-batch) on the GPU. They are faster and more efficient than item_tfms. A good example of these are the ones provided by aug_transforms(). Inside are several batch-level augmentations that help many models.

18. What is a confusion matrix?

![image.png](attachment:c61f02e4-c1f1-4282-bdba-4c6904a35148.png)

> It is a table that visualizes the predictions made versus the correct labels. It helps us understand where the model has problems.
> * Rows represent the actual labels of the dataset.
> * Columns represent what the model predicted.
> * Diagonal: correct predictions.
> * Elsewhere: incorrect predictions.
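>
> A minimal sketch of how such a table is counted, in plain Python. The labels and predictions are made up for illustration, and `confusion_matrix` is a hypothetical helper, not the fastai function.

```python
def confusion_matrix(actual, predicted, classes):
    """Count predictions: rows are actual labels, columns are predictions."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


classes = ["black", "grizzly", "teddy"]
actual = ["black", "black", "grizzly", "grizzly", "teddy", "teddy"]
predicted = ["black", "grizzly", "grizzly", "grizzly", "teddy", "black"]

cm = confusion_matrix(actual, predicted, classes)
for label, row in zip(classes, cm):
    print(label, row)

# The diagonal holds the correct predictions.
correct = sum(cm[i][i] for i in range(len(classes)))
print(correct, "of", len(actual), "correct")  # 4 of 6 correct
```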

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()

19. What does export save?

> It saves the model (the architecture together with the trained parameters of the neural network) to a .pkl file. It also saves the definition of how the DataLoaders are created.

20. What is it called when we use a model for getting predictions, instead of training?

> Inference

21. What are IPython widgets?


> IPython widgets are GUI components that bring together JavaScript and Python functionality in a web browser, and can be created and used within a Jupyter notebook. For instance, the image cleaner that we saw earlier in this chapter is entirely written with IPython widgets. However, we don't want to require users of our application to run Jupyter themselves.

22. When might you want to use CPU for deployment? When might GPU be better?

> GPUs are best for doing <b>identical work in parallel</b>. If you will be analyzing single pieces of data at a time (like a single image or single sentence), then CPUs may be more cost effective instead, especially with more market competition for CPU servers versus GPU servers. GPUs could be used if you collect user responses into a batch at a time, and perform inference on the batch. This may require the user to wait for model predictions. Additionally, there are many other complexities when it comes to GPU inference, like memory management and queuing of the batches.


23. What are the downsides of deploying your app to a server, instead of to a client (or edge) device such as a phone or PC?

> Your data are exposed to the internet.
> 
> Data privacy and integrity.
> 
> Network connection needed.
> 
> Higher latency of actions.

24. What are 3 examples of problems that could occur when rolling out a bear warning system in practice?

> A feedback loop: the system sends the police to the places where the predicted probability of a bear is highest, so bears get confirmed there while less frequent locations are gradually neglected, and the model becomes biased towards the high-probability places.
> 
> It can handle night-time images badly.
> 
> Predictions may be too slow to be useful.

25. What is “out of domain data”?

> Data that is fundamentally different in some aspect compared to the model’s training data. For example, an object detector that was trained exclusively with outside daytime photos is given a photo taken at night.

26. What is “domain shift”?

> This is when the type of <b>data changes gradually over time</b>. For example, an insurance company is using a deep learning model as part of their pricing algorithm, but over time their customers will be different, with the original training data not being representative of current data, and the deep learning model being applied on effectively out-of-domain data.

27. What are the 3 steps in the deployment process?


> ### * Step 1: Export the Trained Model
> Save the trained model using learn.export('model.pkl').

In [None]:
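# Train your model (assumes `dls` was created earlier from a DataBlock).
# vision_learner is the current name; older fastai versions call it cnn_learner.
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)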

# Export the model
learn.export('model.pkl')

> ### * Step 2: Set Up the Deployment Environment
> Create an application using a framework such as FastAPI or Gradio that loads the model and handles prediction requests.

> For example: create a Space on Hugging Face with the Gradio SDK, make sure git is installed, create app.py, and place model.pkl where app.py expects to load it.


> ### * Step 3: Build the app ready for deployment
> Deploy the application to a server or cloud platform and make it accessible for inference requests.

> On Hugging Face Spaces, the build process starts automatically when you push app.py.


28. For a project you’re interested in applying deep learning to, consider the thought experiment “what would happen if it went really, really well?”

> To be done by reader.