# Qwiklabs Assessment: Automate updating catalog information
### Introduction

You work for an online fruit store, and you need to develop a system that will update the catalog information with data provided by your suppliers. The suppliers send the data as large images with an associated description of the products in two files (.TIF for the image and .txt for the description). The images need to be converted to smaller JPEG images and the text needs to be turned into an HTML file that shows the image and the product description. The contents of the HTML file need to be uploaded to a web service that is already running using Django. You also need to gather the name and weight of all fruits from the .txt files and use a Python request to upload it to your Django server.

You will create a Python script that will process the images and descriptions and then update your company's online website to add the new products.

Once the task is complete, the supplier should be notified with an email that indicates the total weight of fruit (in lbs) that was uploaded. The email should have a PDF attached with the name of the fruit and its total weight (in lbs).

Finally, in parallel to the automation running, we want to check the health of the system and send an email if something goes wrong. 

### What you'll do

* Write a script that summarizes and processes sales data into different categories
* Generate a PDF using Python
* Automatically send a PDF by email
* Write a script to check the health status of the system

You'll have 120 minutes to complete this lab.

# Project Problem Statement

Okay, here's the scenario:

You work for an online fruit store, and you need to develop a system that will update the catalog information with data provided by your suppliers. When each supplier has new products for your store, they give you an image and a description of each product.

Given a bunch of images and descriptions of each of the new products, you’ll:

* Upload the new products to your online store. Images and descriptions should be uploaded separately, using two different web endpoints.
* Send a report back to the supplier, letting them know what you imported.

Since this process is key to your business's success, you need to make sure that it keeps running! So, you’ll also:

* Run a script on your web server to monitor system health.
* Send an email with an alert if the server is ever unhealthy.

Hopefully this summary has helped you start thinking about how you’ll approach this task. In case you’re feeling a little scared, don't worry, you can definitely do this! You have all the necessary tools, and the lab description will go into a lot more detail of what you need to do.

Up next, we'll give you a few tips that can help you along the way.

# How to Approach the Problem

We're giving you a pretty big project to do at the end of this course -- but you can totally complete it with what you've learned until now! Take your time, and be methodical. Use these tips to help you:

Break the problem down into smaller pieces. If you’re not sure how to solve a piece of the puzzle, look for an even smaller piece that you can solve. Build up those smaller pieces into a larger solution!

Make one change at a time. Write unit tests to make sure that each new part of the solution works the way you think it does. Run your unit tests frequently to make sure that each part of your solution keeps working as you make changes.

Use version control. Check each part of your solution into version control as you complete it, so you can always roll back to a known version of your code if you make a mistake.

Review module documentation! You are going to need to use these modules to complete the final project. Reading the documentation takes time, but as you become more familiar with the APIs provided by these modules, it could save you from writing a bunch of custom code that could have just been a call to a module function! Remember, we’ve covered these modules in previous lessons too, so feel free to go back and review them if you need a refresher!

* Python Imaging Library (PIL) - Tutorial
* Requests (HTTP client library) - Quickstart
* ReportLab (PDF creation library)
* email (constructing email)
* psutil (processes and system utilization)
* shutil (file operations)
* smtplib (sending email)

Read the lab instructions carefully! Following the instructions and implementing your solution to the specifications that you’re given are critical to completing the task, and to being accurately graded! 

### Fetching supplier data

You'll first need to get the information from the supplier that is currently stored in a Google Drive file. The supplier has sent data as large images with an associated description of the products in two files (.TIF for the image and .txt for the description).

In the home directory, you'll find two script files: download_drive_file.sh and example_upload.py. You can list them with `ls ~/`.

To download the file from the supplier onto our linux-instance virtual machine we will first grant executable permission to the download_drive_file.sh script.

`sudo chmod +x ~/download_drive_file.sh`

Run the download_drive_file.sh shell script with the following arguments:

`./download_drive_file.sh 1LePo57dJcgzoK4uiI_48S01Etck7w_5f supplier-data.tar.gz`

You have now downloaded a file named supplier-data.tar.gz containing the supplier's data. Let's extract the contents from this file using the following command:

`tar xf ~/supplier-data.tar.gz `

This creates a directory named supplier-data, which contains subdirectories named images and descriptions.

The images subdirectory contains images of various fruits, while the descriptions subdirectory has text files containing the description of each fruit. You can have a look at any of these text files using the cat command.

`cat ~/supplier-data/descriptions/007.txt`

Each file has three lines: the first contains the name of the fruit, the second its weight, and the third its description.
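As a minimal sketch of reading that layout, assuming each field sits on its own line as described above: `parse_description` is a hypothetical helper name, and the sample text is made up for illustration.

```python
def parse_description(text):
    """Split a supplier description into its three fields."""
    # Keep only non-empty lines so a trailing newline does not add a field.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    name, weight = lines[0], lines[1]
    # Anything after the second line is treated as the description.
    description = " ".join(lines[2:])
    return {"name": name, "weight": weight, "description": description}


# Illustrative sample only; real files live in ~/supplier-data/descriptions.
sample = "Mango\n300 lbs\nSweet tropical fruit.\n"
fields = parse_description(sample)
print(fields["name"], "|", fields["weight"])
```

Returning a dictionary keeps the field names explicit, which is convenient later when the same data is sent as JSON.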

### Working with supplier images

In this section, you will write a Python script named changeImage.py to process the supplier images. You will be using the PIL library to update all images within ~/supplier-data/images directory to the following specifications:

* Size: Change image resolution from 3000x2000 to 600x400 pixels
* Format: Change image format from .TIFF to .JPEG

This is the challenge section, where you will be writing a script that satisfies the above objectives.

> Note: The raw images in the images subdirectory contain alpha transparency layers, so it is better to convert them from the 4-channel RGBA format to the 3-channel RGB format before processing. Use the convert("RGB") method for this conversion.

After processing the images, save them in the same path ~/supplier-data/images, with a JPEG extension.
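When choosing which files to convert and what to call the results, it may help to test the file extension itself rather than search the whole name. The sketch below assumes the source files end in .tif or .tiff; `jpeg_target` is a hypothetical helper, not part of the lab.

```python
import os


def jpeg_target(filename):
    """Return the .jpeg name for a TIFF file, or None for other files."""
    stem, ext = os.path.splitext(filename)
    # Compare the extension itself rather than searching the whole name,
    # so a file such as "tiff_notes.txt" is not picked up by mistake.
    if ext.lower() not in (".tif", ".tiff"):
        return None
    return stem + ".jpeg"


for name in ["001.tiff", "README", "tiff_notes.txt"]:
    print(name, "->", jpeg_target(name))
```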


#### changeImage.py

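```python
#!/usr/bin/env python3

import os

from PIL import Image

# Relative to the home directory, where supplier-data was extracted.
path = "supplier-data/images"

for filename in os.listdir(path):
    # Match the extension itself, so unrelated files are skipped.
    if filename.endswith(".tiff"):
        name, ext = os.path.splitext(filename)
        im = Image.open(os.path.join(path, filename))
        # Drop the alpha channel first (JPEG has no transparency), then resize.
        im = im.convert("RGB").resize((600, 400))
        im.save(os.path.join(path, name + ".jpeg"), "JPEG")
```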

Now, let's check the specifications of the images you just updated. Inspect any image using the following command:

`file ~/supplier-data/images/003.jpeg`

### Uploading images to web server

You have modified the fruit images through changeImage.py script. Now, you will have to upload these modified images to the web server that is handling the fruit catalog. To do that, you'll have to use the Python requests module to send the file contents to the [linux-instance-IP-Address]/upload URL.

Copy the external IP address of your instance from the Connection Details Panel on the left side and enter the IP address in a new web browser tab. This opens a web page displaying the text "Fruit Catalog".

In the home directory, you'll have a script named example_upload.py to upload images to the running fruit catalog web server. To view the example_upload.py script use the cat command.

#### example_upload.py

```python
#!/usr/bin/env python3
import requests

# This example shows how a file can be uploaded using
# the Python Requests module

url = "http://localhost/upload/"
with open('/usr/share/apache2/icons/icon.sheet.png', 'rb') as opened:
    r = requests.post(url, files={'file': opened})
```


This script uploads a sample image named icon.sheet.png.

Now check that the file icon.sheet.png was uploaded to the web server by visiting the URL [linux-instance-IP-Address]/media/images/ and clicking on the file name.

In a similar way, you are going to write a script named supplier_image_upload.py that takes the jpeg images from the supplier-data/images directory that you've processed previously and uploads them to the web server fruit catalog.

Refresh the URL opened earlier, and now you should find all the images uploaded successfully.
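One possible way to structure such an upload script is sketched below. The `upload_images` helper and its `post` parameter are hypothetical: passing the posting function in makes the loop easy to try without a running server, and by default it falls back to `requests.post`, as in example_upload.py.

```python
import os


def upload_images(directory, url, post=None):
    """POST every .jpeg file in `directory` to `url`; return uploaded names."""
    if post is None:
        # Imported lazily so the helper can be exercised without a server.
        import requests
        post = requests.post
    uploaded = []
    for filename in sorted(os.listdir(directory)):
        if not filename.endswith(".jpeg"):
            continue
        with open(os.path.join(directory, filename), "rb") as opened:
            response = post(url, files={"file": opened})
        # Raise on 4xx/5xx so a failed upload is not silently ignored.
        response.raise_for_status()
        uploaded.append(filename)
    return uploaded
```

On the lab instance this could be called as `upload_images("supplier-data/images", "http://localhost/upload/")`.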

### Uploading the descriptions

The Django server is already set up to show the fruit catalog for your company. You can visit the main website by entering linux-instance-IP-Address in the URL bar or by removing /media/images from the existing URL opened earlier.

Check out the Django REST framework by navigating to linux-instance-IP-Address/fruits in your browser.

Currently, there are no products in the fruit catalog web-server. You can create a test fruit entry by entering the following into the content field:

```
{"name": "Test Fruit", 
"weight": 100, 
"description": "This is the description of my test fruit", 
"image_name": "icon.sheet.png"}
```

After entering the above data into the content field click on the POST button. Now visit the main page of your website (by going to http://[linux-instance-external-IP]/), and the new test fruit you uploaded appears.

To add fruit images and their descriptions from the supplier on the fruit catalog web-server, create a new Python script that will automatically POST the fruit images and their respective description in JSON format.

Write a Python script named run.py to process the text files (001.txt, 003.txt ...) from the supplier-data/descriptions directory. The script should turn the data into a JSON dictionary by adding all the required fields, including the image associated with the fruit (image_name), and uploading it to http://[linux-instance-external-IP]/fruits using the Python requests library.

#### supplier_image_upload.py

```python
#!/usr/bin/env python3
import os

import requests

# Upload every processed JPEG image to the fruit catalog web server
# using the Python Requests module.

url = "http://localhost/upload/"
path = "supplier-data/images"

for filename in os.listdir(path):
    if filename.endswith(".jpeg"):
        with open(os.path.join(path, filename), "rb") as opened:
            r = requests.post(url, files={"file": opened})
            # Stop with an error if the server rejects the upload.
            r.raise_for_status()
```


Now, you'll have to process the .txt files (named 001.txt, 002.txt, ...) in the supplier-data/descriptions/ directory and save them in a data structure so that you can then upload them via JSON. Note that all files are written in the following format, with each piece of information on its own line:

* name
* weight (in lbs)
* description

The data model in the Django application fruit has the following fields: name, weight, description and image_name. The weight field is defined as an integer field. So when you process the weight information of the fruit from the .txt file, you need to convert it into an integer. For example if the weight is "500 lbs", you need to drop "lbs" and convert "500" to an integer.
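A minimal sketch of that conversion follows; `parse_weight` is a hypothetical helper name. Note that `str.strip(" lbs")` removes any of the characters space, l, b and s from both ends rather than the literal suffix, and it still returns a string, so an explicit suffix check followed by `int()` is used here.

```python
def parse_weight(text):
    """Convert a weight line such as "500 lbs" to the integer 500."""
    cleaned = text.strip()
    # Remove the unit only when it is really there.
    if cleaned.lower().endswith("lbs"):
        cleaned = cleaned[:-3]
    return int(cleaned.strip())


print(parse_weight("500 lbs"))  # 500
```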

The image_name field will allow the system to find the image associated with the fruit. Don't forget to add all fields, including the image_name! The final JSON object should be similar to:

```
{"name": "Watermelon", 
"weight": 500, 
"description": "Watermelon is good for relieving heat, eliminating annoyance and quenching thirst. It contains a lot of water, which is good for relieving the symptoms of acute fever immediately. The sugar and salt contained in watermelon can diuretic and eliminate kidney inflammation. Watermelon also contains substances that can lower blood pressure.", 
"image_name": "010.jpeg"}
```
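The object above could be assembled as in the following sketch. It assumes that a description file and its image share the same number (010.txt pairs with 010.jpeg), as the example suggests; `build_payload` is a hypothetical helper and the description text is a placeholder.

```python
import json
import os


def build_payload(txt_filename, name, weight, description):
    """Assemble the dictionary with the four fields of the fruit model."""
    stem, _ = os.path.splitext(os.path.basename(txt_filename))
    return {
        "name": name,
        "weight": weight,
        "description": description,
        # 010.txt pairs with 010.jpeg, the image produced earlier.
        "image_name": stem + ".jpeg",
    }


payload = build_payload("010.txt", "Watermelon", 500, "Sample description.")
print(json.dumps(payload, indent=2))
```

When the dictionary is passed to `requests.post` through the `json=` argument, the library serializes it, so `json.dumps` is only needed here for display.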

Iterate over all the fruits and use post method from Python requests library to upload all the data to the URL http://[linux-instance-external-IP]/fruits

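#### run.py

```python
#!/usr/bin/env python3

import os

import requests

# Replace with http://[linux-instance-external-IP]/fruits/ for your instance.
url = "http://34.28.179.72/fruits/"
path = "supplier-data/descriptions/"

for filename in os.listdir(path):
    if filename.endswith(".txt"):
        name, ext = os.path.splitext(filename)

        with open(os.path.join(path, filename)) as f:
            lines = [line.rstrip("\n") for line in f]

        desc = {}
        desc["name"] = lines[0]
        # The weight field is an integer: drop the "lbs" unit and convert.
        desc["weight"] = int(lines[1].replace("lbs", "").strip())
        desc["description"] = lines[2]
        # Each description shares its number with the processed image.
        desc["image_name"] = f"{name}.jpeg"

        response = requests.post(url, json=desc)
        response.raise_for_status()
```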


Now go to the main page of your website (by going to http://[linux-instance-IP-Address]/) and check out how the new fruits appear.

### Generate a PDF report and send it through email

Once the images and descriptions have been uploaded to the fruit store web-server, you will have to generate a PDF file to send to the supplier, indicating that the data was correctly processed. To generate PDF reports, you can use the ReportLab library. The content of the report should look like this:

**Processed Update on <Today's date>**

[blank line]

name: Apple

weight: 500 lbs

[blank line]

name: Avocado

weight: 200 lbs

[blank line]

...
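The layout above can be produced as a single string, sketched here under the assumption that the text is later rendered by a ReportLab `Paragraph`, which accepts `<br/>` as a line break. `format_report_body` is a hypothetical helper, and the fruit values are the ones from the example.

```python
def format_report_body(fruits):
    """Join (name, weight) pairs into one paragraph string."""
    blocks = []
    for name, weight in fruits:
        blocks.append("name: {}<br/>weight: {}<br/>".format(name, weight))
    # An extra <br/> between blocks yields the blank line in the layout.
    return "<br/>".join(blocks)


body = format_report_body([("Apple", "500 lbs"), ("Avocado", "200 lbs")])
print(body)
```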

### Script to generate a PDF report
Using the reportlab Python library, define the method generate_report to build the PDF reports. We have already covered how to generate PDF reports in an earlier lesson; you will want to use similar concepts to create a PDF report named processed.pdf.

#### reports.py

```python
#!/usr/bin/env python3

from reportlab.lib.styles import getSampleStyleSheet
from reportlab.platypus import Paragraph, SimpleDocTemplate, Spacer


def generate_report(attachment, title, paragraph):
    # attachment: output PDF path; title and paragraph: report text.
    report = SimpleDocTemplate(attachment)
    styles = getSampleStyleSheet()
    report_title = Paragraph(title, styles["h1"])
    report_content = Paragraph(paragraph, styles["BodyText"])
    empty_line = Spacer(1, 20)

    report.build([report_title, empty_line, report_content])
```

### Send report through email

Once the PDF is generated, you need to send the email using the emails.generate_email() and emails.send_email() methods.

Import the necessary libraries and define the generate_email and send_email methods.


#### emails.py

```python
#!/usr/bin/env python3

import email.message
import mimetypes
import os.path
import smtplib


def generate_email(sender, recipient, subject, body, *attachments):
    """Build an email message; attachments are optional file paths."""
    message = email.message.EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)

    for attachment in attachments:
        filename = os.path.basename(attachment)
        mime_type, _ = mimetypes.guess_type(attachment)
        # Fall back to a generic type if the extension is not recognized.
        mime_type = mime_type or "application/octet-stream"
        maintype, subtype = mime_type.split("/", 1)

        with open(attachment, "rb") as ap:
            message.add_attachment(ap.read(), maintype=maintype,
                                   subtype=subtype, filename=filename)

    return message


# Shorter alias, used by the scripts that call emails.generate().
generate = generate_email


def send_email(message):
    """Send the message through the local SMTP server."""
    mail_server = smtplib.SMTP("localhost")
    mail_server.send_message(message)
    mail_server.quit()
```


Create another script named report_email.py to process supplier fruit description data from the supplier-data/descriptions directory.

Import all the necessary libraries (os, datetime and reports) that will be used to process the text data from the supplier-data/descriptions directory into the format below:

name: Apple

weight: 500 lbs

[blank line]

name: Avocado

weight: 200 lbs

[blank line]

...

Once you have completed this, call the main method which will process the data and call the generate_report method from the reports module:

You will need to pass the following arguments to the reports.generate_report method: the text description processed from the text files as the paragraph argument, the report title as the title argument, and the file path of the PDF to be generated as the attachment argument (use '/tmp/processed.pdf').

`reports.generate_report(attachment, title, paragraph)`
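For the title argument, a small sketch of building the "Processed Update on" line is shown below. The `report_title` helper is hypothetical, and the ISO date format is an assumption, since the lab does not prescribe one.

```python
from datetime import date


def report_title(today=None):
    """Return the report title for `today` (defaults to the current date)."""
    if today is None:
        today = date.today()
    # ISO format (YYYY-MM-DD) is an assumption; the lab does not fix one.
    return "Processed Update on {}".format(today.isoformat())


print(report_title(date(2024, 1, 31)))  # Processed Update on 2024-01-31
```

Accepting the date as a parameter keeps the function easy to check with a fixed value.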

Once you define the generate_email and send_email methods, call the methods under the main method after creating the PDF report:

Use the following details to pass the parameters to emails.generate_email():

* From: automation@example.com
* To: username@example.com  -- Replace username with the username given in the Connection Details Panel on the right hand side.
* Subject line: Upload Completed - Online Fruit Store
* E-mail Body: All fruits are uploaded to our website successfully. A detailed list is attached to this email.
* Attachment: Attach the path to the file processed.pdf
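The details above map onto Python's standard `email.message.EmailMessage` roughly as follows. This is a sketch: `build_message` is a hypothetical helper that takes the attachment as in-memory bytes so it can run anywhere, the PDF bytes are a placeholder, and `username` stands for the lab's username.

```python
import mimetypes
from email.message import EmailMessage


def build_message(sender, recipient, subject, body, attachment=None):
    """Create an EmailMessage; `attachment` is an optional (name, bytes) pair."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    if attachment is not None:
        filename, data = attachment
        guessed, _ = mimetypes.guess_type(filename)
        # Fall back to a generic type when the extension is unknown.
        maintype, subtype = (guessed or "application/octet-stream").split("/", 1)
        message.add_attachment(data, maintype=maintype, subtype=subtype,
                               filename=filename)
    return message


message = build_message(
    "automation@example.com",
    "username@example.com",  # placeholder; use the lab's username
    "Upload Completed - Online Fruit Store",
    "All fruits are uploaded to our website successfully. "
    "A detailed list is attached to this email.",
    attachment=("processed.pdf", b"%PDF-1.4 placeholder bytes"),
)
print(message["Subject"], len(list(message.iter_attachments())))
```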


#### report_email.py

```python
#!/usr/bin/env python3

import os
import sys
from datetime import datetime

import emails
import reports


def write_content(path):
    """Build the report body: name and weight of each fruit."""
    report_content = ""
    # Sort so the fruits appear in a stable order (001.txt, 002.txt, ...).
    for filename in sorted(os.listdir(path)):
        if filename.endswith(".txt"):
            with open(os.path.join(path, filename)) as f:
                lines = [line.rstrip("\n") for line in f]

            report_content += "name: {}<br />".format(lines[0])
            report_content += "weight: {}<br />".format(lines[1])
            # Extra break: the blank line that separates two fruits.
            report_content += "<br />"
    return report_content


def write_title():
    str_tdy = datetime.today().strftime("%Y-%m-%d")
    title = "Processed Update on {}".format(str_tdy)
    return title


def main(argv):
    path = "supplier-data/descriptions/"
    report_title = write_title()
    report_content = write_content(path)
    reports.generate_report("/tmp/processed.pdf", report_title, report_content)

    sender = "automation@example.com"
    receiver = "{}@example.com".format(os.environ.get("USER"))
    subject = "Upload Completed - Online Fruit Store"
    body = "All fruits are uploaded to our website successfully. A detailed list is attached to this email."
    message = emails.generate(sender, receiver, subject, body, "/tmp/processed.pdf")
    emails.send_email(message)


if __name__ == "__main__":
    main(sys.argv)
```

Now, check the webmail by visiting [linux-instance-external-IP]/webmail. Here, you'll need to log in to Roundcube using the username and password mentioned in the Connection Details Panel on the left hand side, then click Login.

Now you should be able to see your inbox, with one unread email. Open the mail by double clicking on it. There should be a report in PDF format attached to the mail. View the report by opening it.

### Health check

This is the last part of the lab, where you will have to write a Python script named health_check.py that will run in the background monitoring some of your system statistics: CPU usage, disk space, available memory and name resolution. Moreover, this Python script should send an email if there are problems, such as:

* Report an error if CPU usage is over 80%
* Report an error if available disk space is lower than 20%
* Report an error if available memory is less than 500MB
* Report an error if the hostname "localhost" cannot be resolved to "127.0.0.1"
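The four rules above can be kept separate from the code that measures the system, which makes them easy to check with made-up numbers. In this sketch `find_problems` and its parameter names are hypothetical; the error texts follow the subject lines required by the lab.

```python
def find_problems(cpu_percent, disk_free_percent, mem_available_mb, localhost_ip):
    """Return the error subjects for every threshold that is violated."""
    problems = []
    if cpu_percent > 80:
        problems.append("Error - CPU usage is over 80%")
    if disk_free_percent < 20:
        problems.append("Error - Available disk space is less than 20%")
    if mem_available_mb < 500:
        problems.append("Error - Available memory is less than 500MB")
    if localhost_ip != "127.0.0.1":
        problems.append("Error - localhost cannot be resolved to 127.0.0.1")
    return problems


# Made-up readings: only the CPU threshold is exceeded.
print(find_problems(95.0, 35.0, 2048, "127.0.0.1"))
```

Note that the disk rule is expressed in percent of free space on a 0 to 100 scale, not as a fraction.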


Import the necessary Python libraries (e.g., shutil, psutil) to write this script.
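Two of the readings can be taken with the standard library alone, as in this sketch; `disk_free_percent` and `resolves_to` are hypothetical helper names. CPU usage and available memory would still come from psutil.

```python
import shutil
import socket


def disk_free_percent(path="/"):
    """Percentage of the disk holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total * 100


def resolves_to(hostname, expected_ip):
    """True if `hostname` resolves to `expected_ip`, False otherwise."""
    try:
        return socket.gethostbyname(hostname) == expected_ip
    except OSError:
        # Resolution failed entirely, which also counts as a problem.
        return False


print(round(disk_free_percent(), 1), resolves_to("localhost", "127.0.0.1"))
```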

Complete the script to check the system statistics every 60 seconds, and in event of any issues detected among the ones mentioned above, an email should be sent with the following content:

* From: automation@example.com
* To: username@example.com -- Replace username with the username given in the Connection Details Panel on the right hand side.
* Subject line (one per failed check):

| Case | Subject line |
| --- | --- |
| CPU usage is over 80% | Error - CPU usage is over 80% |
| Available disk space is lower than 20% | Error - Available disk space is less than 20% |
| Available memory is less than 500MB | Error - Available memory is less than 500MB |
| Hostname "localhost" cannot be resolved to "127.0.0.1" | Error - localhost cannot be resolved to 127.0.0.1 |

* E-mail Body: Please check your system and resolve the issue as soon as possible.

> Note: There is no attachment file here, so you must be careful while defining the generate_email() method in the emails.py script or you can create a separate generate_error_report() method for handling non-attachment email.


#### health_check.py

```python
#!/usr/bin/env python3

import os
import shutil
import socket
import sys

import psutil

import emails


def health_check():
    """Return one error subject per failed check (empty if healthy)."""
    errors = []

    # Sample CPU usage over one second; without an interval the first
    # call returns a meaningless 0.0.
    if psutil.cpu_percent(interval=1) > 80:
        errors.append("Error - CPU usage is over 80%")

    # Percentage of the root file system that is still free.
    disk = shutil.disk_usage("/")
    if disk.free / disk.total * 100 < 20:
        errors.append("Error - Available disk space is less than 20%")

    # Available memory in MB.
    if psutil.virtual_memory().available / 1024 ** 2 < 500:
        errors.append("Error - Available memory is less than 500MB")

    try:
        local_host = socket.gethostbyname("localhost")
    except socket.gaierror:
        local_host = None
    if local_host != "127.0.0.1":
        errors.append("Error - localhost cannot be resolved to 127.0.0.1")

    return errors


def send_email(subject):
    sender = "automation@example.com"
    receiver = "{}@example.com".format(os.environ.get("USER"))
    body = "Please check your system and resolve the issue as soon as possible."
    # No attachment is passed: alert emails carry only a subject and body.
    message = emails.generate(sender, receiver, subject, body)
    emails.send_email(message)


def main(argv):
    # One email per failed check, each with its own subject line.
    for subject in health_check():
        send_email(subject)


if __name__ == "__main__":
    main(sys.argv)
```


Next, go to the webmail inbox and refresh it. There should only be an email if something goes wrong, so hopefully you don't see a new email.

To test out your script, you can install the stress tool.

`sudo apt install stress`

Next, call the tool using a good number of CPUs to fully load our CPU resources:

`stress --cpu 8`

Allow the stress test to run, as it will maximize our CPU utilization. Now run health_check.py by opening another SSH connection to the linux-instance. Navigate to Accessing the virtual machine on the navigation pane on the right-hand side to open another connection to the instance.

Now run the script:

`./health_check.py`

Check your inbox for any new email.

Open the email with the subject "Error - CPU usage is over 80%" by double clicking it.

Stop the stress --cpu command by pressing Ctrl-C.

Now, you will be setting a cron job that executes the script health_check.py every 60 seconds and sends health status to the respective user.

To set a user cron job use the following command:

`crontab -e`

Enter 1 to open in the nano editor. Now, set the complete path for the health_check.py script, and save by pressing Ctrl-O, the Enter key, and Ctrl-X.