<a href="https://colab.research.google.com/github/DolicaAkelloEgwel/python-slides/blob/main/python-for-beginners/python-for-beginners-image-scraper.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python For Beginners

## Setup

### 1. Go to this link: bit.ly/3F5b88F
### 2. Log in to your Google account

## Overview

Part One:

+ The Python Interpreter
+ Hello World / Variables
+ Variable Reassignment
+ Numbers
+ Lists
+ Combining Things / Using Operators
+ Libraries
+ List Comprehensions

## Using Python Notebooks

Python Notebooks are made up of blocks of code called cells. These cells can be run by clicking on the play button on the left of each code block. If the code runs successfully then you will see a green tick appear on the left. If there is any output then it will appear beneath the cell.

Python Notebooks are handy resources for teaching as they allow me to combine explanation with examples. But do keep in mind that they do not play especially well with version control and for a more serious project you will probably want to learn how to write your own Python files that you can run locally.

## Connecting Drive

Run the cell below.

In [None]:
from google.colab import drive

drive.mount("/content/drive")

In the Files panel on the left-hand side, you should now see a `drive` folder. At the end of the workshop you will be able to save the images that you have created by moving them to `drive > MyDrive`.

![](connecting-drive.png)

## Some Python Basics

### Calculator

The Python interpreter can be used as a calculator.

**Exercise:** Find the value of `2 + 2` using Python.

In [None]:
# live coding goes here

### Comments

Comments are lines of text that are ignored by the Python interpreter. In Python you write comments with the `#` symbol. They can be used to indicate what a certain part of the code is doing to a fellow coder or your future self. 

You also make life easier for me when I help you out with your code at the CTL :)

In [None]:
# I am about to add some numbers
1 + 1  # The numbers are being added
# I have just added some numbers

### Variables and `print()`

Variables allow you to store values so that they can be used again or changed later. They are like labelled boxes for storing data. The `print()` command can be used to show the value of a variable.

![](variable-box.png)

It is possible to calculate something and store the value in a variable.

In [None]:
my_number = 2 + 2
print(my_number)

When you try to look for a variable that hasn't been defined (yet) you get an error.

In [None]:
print(this_variable_does_not_exist)

To create a text variable in Python we store it in something called a **string.** A string is a sequence of characters that is **enclosed with quotation marks.**

In [None]:
my_text = "Hello World!"
print(my_text)

**Exercise:** Create a variable called `greeting` containing the text `My name is your-name` then use the `print()` command to show it.

In [None]:
# live coding goes here

#### Naming Variables

There are rules for naming variables in Python. 

- A variable name can only contain alphanumeric characters and underscores (A-Z, a-z, 0-9, and _)
- A variable name cannot start with a number (`my_var1` and `my_var2` are fine, but `1my_var` is not)
- Variable names are case-sensitive (so `age`, `Age` and `AGE` would be three different variables)
- A variable name cannot be any of the [Python keywords](https://www.geeksforgeeks.org/python-keywords/).
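
If you want to check these rules yourself, one possible sketch is below. It uses the built-in string method `isidentifier()` and the `keyword` module from the Python Standard Library. The helper name `is_valid_variable_name` is made up for this example and is not part of Python.

```python
import keyword


def is_valid_variable_name(name):
    """Returns True if `name` could be used as a variable name."""
    # isidentifier() checks the characters (including the first one),
    # and iskeyword() checks whether the name is a reserved word
    return name.isidentifier() and not keyword.iskeyword(name)


print(is_valid_variable_name("my_var1"))  # fine
print(is_valid_variable_name("1my_var"))  # starts with a number
print(is_valid_variable_name("for"))  # a Python keyword
```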

![](reserved.jpeg)


### Reassignment

We can change the value of a variable by using the `=` operator to give it another value. This is known as _reassignment_. Be aware that Notebooks allow you to execute cells in any order, meaning that it can be a bit trickier to ensure others get the same output you do.

In [None]:
my_text = "hello!"

In [None]:
# the problem with notebooks...
print(my_text)

In [None]:
my_text = "goodbye!"

This is equivalent to taking out what was previously in our box, tossing it away, and putting something else in that box. Something called the "garbage collector" in Python will periodically delete any data it finds that doesn't belong to a box.
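
Here is a small sketch of reassignment where everything happens in a single cell, so the order is never in doubt. The variable name `box` is only for illustration.

```python
box = "hello!"
print(box)

# Reassign: the old value is replaced by the new one
box = "goodbye!"
print(box)

# A variable can even be given a value of a different type
box = 42
print(box)
```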

### Numbers

Python has a number of different numeric types. In general, the ones you'll be using the most are whole numbers (Integers) and numbers with decimal places (Floats).

In [None]:
num = 10
pi = 3.14

We can use the `type()` command to see the data type of something.

In [None]:
type(num)

In [None]:
type("this is a string")

In [None]:
type(1 / 3)
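
As a rough sketch of how the two numeric types interact: dividing with `/` gives a float even when both numbers are integers, while `//` (floor division) keeps the result as an integer. The variable names here are made up for the example.

```python
whole = 10
decimal = 3.14

print(type(whole))  # <class 'int'>
print(type(decimal))  # <class 'float'>

# Dividing two integers with / produces a float
print(10 / 2)  # 5.0
# Floor division with // keeps the result as an integer
print(10 // 3)  # 3

# Mixing an integer and a float gives a float
mixed = whole + decimal
print(type(mixed))  # <class 'float'>
```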

### Lists

A list is a collection of different data that is stored in a single variable.

In [None]:
my_list = ["text-1", "text-2", "text-3"]
print(my_list)

To access an item from a list we need an index. An index is a number that corresponds with the item's place in the list. Because Python has zero-based indexing, the first element in the list (which is `"text-1"` in this case) has an index of 0. The second item has an index of 1, and the third item has an index of 2. 

To access an item in a list you use the name of the list followed by the index of the item you want to access enclosed in square brackets.

In [None]:
print(my_list[0])
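
A few more indexing examples, as a sketch. The list `colours` is made up for this example. `len()` gives the number of items in a list, and a negative index counts backwards from the end.

```python
colours = ["red", "green", "blue"]

print(colours[0])  # red (the first item)
print(colours[2])  # blue (the third item)
print(len(colours))  # 3

# Negative indexes count backwards from the end
print(colours[-1])  # blue (the last item)

# An item can be replaced by assigning to its index
colours[1] = "yellow"
print(colours)  # ['red', 'yellow', 'blue']
```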

You can also create empty lists. Sometimes you know that a particular problem will need a list but that the list's content won't be ready until later on in your program.

In [None]:
# live coding goes here - don't copy this

<details>
<summary>Code example: Creating empty lists</summary>

```python
# You could do it like this
empty_list = []
# Or like this...
also_an_empty_list = list()

# Both will work
print(empty_list)
print(also_an_empty_list)

# In case you want to double-check that empty lists are still lists
print(type(empty_list))
print(type(also_an_empty_list))
```

</details>

So let's say I want to do some sort of operation on every item in my list. How could I do this...?

In [None]:
# live coding goes here - don't copy this as this is a bad example!

### For Loops

For-loops allow you to repeat a block of code multiple times. They are especially helpful for doing things with lists.

In [None]:
cool_list = [1, 2, 3, 4, 5]
for num in cool_list:
    print(num)

By the way, `num` is a kind of "nickname" that I am using for the items in my list. It can actually be anything provided it's a valid variable name. This will work too.

In [None]:
for elephant_and_castle in cool_list:
    print(elephant_and_castle)

Loops allow us to perform the same operation on every item in a list. They are like the programming equivalent of a conveyor belt.

![](cake-factory.jpg)

Indentation controls what is inside and outside our loop. You should think of it as being part of the code as well.

In [None]:
# live coding goes here - don't copy this

<details>
<summary>Code example: Loops and indentation</summary>

```python
for i in range(3):
    print("Hello from inside the loop")
print("Hello from outside the loop")
```

</details>

Having indentation out of nowhere confuses Python.

In [None]:
# live coding goes here    
    print("Hello")

By the same token, a lack of indentation where Python is expecting some also confuses it.

In [None]:
# live coding goes here
for _ in range(5):
print("Hello")

### Combining Things with `+`

The `+` operator doesn't just allow us to add numbers. It can also be used to combine some of the different data types in Python. Below you can see that it can be used to combine lists.

In [None]:
first_list = ["aaa", "bbb", "ccc"]
second_list = [1, 2, 3]

combined_list = first_list + second_list
print(combined_list)

**Exercise**: What will I get from the code below?

In [None]:
print(second_list + first_list)

The `+` operator can also be used with strings.

In [None]:
hello = "hello"
world = "world"
hello_world = hello + world
print(hello_world)
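
Note that `+` joins strings exactly as they are, so no space is added between them. A minimal sketch of adding one yourself (the variable names are only illustrative):

```python
first_word = "hello"
second_word = "world"

# No space is added automatically
print(first_word + second_word)  # helloworld

# Add the space yourself by joining a third string in the middle
spaced = first_word + " " + second_word
print(spaced)  # hello world
```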

**Exercise:** The code below will take a bit of text called `first_part` and combine it with another text called `second_part`. Afterwards it will `print()` the combined text. However, the lines in the code are out of order. What would be the right order for the code?

You probably want to use Ctrl/Command+X and Ctrl/Command+V. If you hit copy cell you'll be copying the _entire_ block with all the code. This is not what you want. Another example of why Notebooks are a bit uh-oh at times...

<details>
<summary>Stuck? Click here for a hint.</summary>

Remember that in programming, the lines of code are executed in a sequential manner from **top to bottom**. This means that you cannot perform any operations or manipulations with a variable <em>before</em> it has been created or defined in the code.

If you encounter an error stating that a specific variable is not defined, it indicates that the computer has been instructed to perform an operation or access a value that, from its perspective, does not exist at that point in the code.

Remember that `NameError` means that a variable is undefined and does not exist (yet).

</details>

In [None]:
# live coding goes here
combined_text = first_part + second_part
first_part = "Hello, my name is "
print(combined_text)
second_part = "name-goes-here."

### Extra: Comprehensions

Comprehensions are the "Pythonic" way of doing things with lists. You don't have to do it this way but you may find it interesting...

In [None]:
my_list = [i for i in range(5)]
print(my_list)

<details>
<summary>Click here if you want to know what <code>range()</code> is doing.</summary>

`range()` is a built-in command that creates a sequence of numbers. By default it starts from 0, increases by 1, and stops before a specified number. In the example above, the output from `range()` is being placed in a list.
</details>
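
`range()` can also be given a start value and a step size. A short sketch (wrapping it in `list()` just makes the numbers visible when printed):

```python
# Stop before 5: 0, 1, 2, 3, 4
print(list(range(5)))

# Start at 2 and stop before 6: 2, 3, 4, 5
print(list(range(2, 6)))

# Start at 0, stop before 10, and go up in steps of 2: 0, 2, 4, 6, 8
evens = list(range(0, 10, 2))
print(evens)
```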

Now let's try doing the same thing without a comprehension...

In [None]:
# live coding goes here - don't copy this

<details>
<summary>Code example: Creating a list of numbers without a comprehension.</summary>

```python
# Create an empty list
my_list = []

# Have a loop that goes from 0 to 4
for i in range(5):
    # Add a number at the end of the list during each stage of the loop
    my_list.append(i)
# Print the list
print(my_list)
```

This will give you the same result as the cell above. But comprehensions allow you to create a list and populate it in a single line.

</details>
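
Comprehensions become more useful when you change or filter the items as you go. The sketch below is one possible illustration, and the variable names are made up.

```python
numbers = [1, 2, 3, 4, 5]

# Do something to every item: double each number
doubled = [n * 2 for n in numbers]
print(doubled)  # [2, 4, 6, 8, 10]

# Keep only some of the items: the even numbers
evens = [n for n in numbers if n % 2 == 0]
print(evens)  # [2, 4]
```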

### Using Libraries

+ Code written by other developers
+ Good chance someone has tried to solve the same problem as you
+ Don't have to reinvent the wheel
+ Tested and optimised solutions for common problems

I found this `emoji` library after a quick search on Google. You can find out more about it [here](https://carpedm20.github.io/emoji/docs/).

In Python Notebooks you install a library with the command `%pip install a-helpful-library` but in the terminal/console it's just `pip install a-helpful-library`.

Afterwards we import it using the `import` command. This is just `import a_helpful_library` (names used with `import` can't contain hyphens, so the import name is sometimes slightly different from the install name). Simply installing a library doesn't mean you can use it. You might install 20 libraries but work on a project that only needs three of them - this is why a library needs to be _imported_ before it can be used.

In [None]:
%pip install emoji
import emoji

Now we can use the `emojize` command that is given to us by the `emoji` library.

In [None]:
emojified_text = emoji.emojize("There is a :snake: in my boot!")
print(emojified_text)

So now we can use this and some loops to get Daft Punk lyrics...

In [None]:
around_the_world_count = 144
for _ in range(around_the_world_count):
    print(emoji.emojize("Around the :globe_showing_Americas:"))

## Making an Image Scraper

For this portion of the workshop we'll use Python to download some images from the web. We will then use some libraries to add "glitchy" effects to them.

This is the "fun" part of the workshop where I show you an example of how you might use Python in an actual project. You don't need to worry about understanding every little thing.

To start with, we're going to need some libraries to help us download images.

In [None]:
from bs4 import BeautifulSoup
import requests

### Getting Stuff from the Web

[Shopify](https://www.shopify.com/stock-photos) is a website that has a large collection of free stock photos. The code below is a **function** that is capable of downloading images from Shopify.

Functions in Python are blocks of reusable code that are used to perform specific tasks. They are defined using the `def` keyword and allow you to break down complex tasks into smaller parts. Functions can take input values, called arguments, and return output values. Functions help with making your code more modular, more reusable, and more maintainable.
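
Here is a much smaller function as a sketch of the idea. The name `make_greeting` and its argument are invented for this example.

```python
def make_greeting(name):
    # The argument `name` is the input, and the returned string is the output
    return "Hello, " + name + "!"


# Call the function with different arguments
print(make_greeting("Ada"))  # Hello, Ada!

# The returned value can be stored in a variable
message = make_greeting("Guido")
print(message)  # Hello, Guido!
```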

In [None]:
def photo_downloader(image_theme):
    """Downloads images from Shopify based on an image theme.

    Args:
        image_theme: The search term that will be used for Shopify.

    Returns:
        A list of up to 30 images (as bytes) that appeared in the search results.
    """
    # Get the website
    r = requests.get(
        f"https://www.shopify.com/stock-photos/photos/search?q={image_theme}",
        timeout=20,
    )

    # Boot up the BeautifulSoup HTML parser
    soup = BeautifulSoup(r.text, "html.parser")

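    # Check whether the response contains "Just a moment" (a sign that we may have
    # been shown a waiting page rather than the search results) - this is annoying...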
    print("Just a moment" in r.text)

    # Find the images in the website
    urls = soup.find_all("img", src=True)

    # Create an empty list
    images = []

    # Loop through the image URLs
    for url in urls:
        try:
            # Download the image and add it to the list
            img_response = requests.get(url.attrs["src"], timeout=20)
            images.append(img_response.content)
        except requests.RequestException:
            # Skip this image if it failed to download and move on to the next one
            pass

        # Quit once we have 30 images in our list
        if len(images) == 30:
            break

    # Return the images
    print(f"Downloaded {len(images)} images.")
    return images

You might notice that beneath the function _header_ I have placed a special type of comment called a **docstring**. Docstrings are longer comments that describe what a function does. They do not affect how a function behaves, but they make your code much easier to understand.
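
Docstrings can also be read back while a program is running. A small sketch, using a made-up function called `double`: the docstring is stored in the function's `__doc__` attribute, and the built-in `help()` command shows the same text.

```python
def double(number):
    """Returns the given number multiplied by two."""
    return number * 2


# The docstring does not change what the function does
print(double(4))  # 8

# The docstring text is stored on the function itself
print(double.__doc__)
```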

Now we can run or _call_ the function above by providing it with an argument for the `image_theme`. For this example I'm using the word "cats" but you're free to change it to whatever interests you. I am taking the list that is _returned_ by the function and saving it to a variable called `image_bytes`.

In [None]:
theme = "cats"
image_bytes = photo_downloader(theme)

The bytes format can't be displayed easily and the other libraries that can do things with images don't know how to read it. To work around this, we need a function that converts from bytes to a different format.

Pillow or PIL is a Python library that allows us to work with images. It provides a certain format for images called (drumroll) `Image`. To make a function that can convert data from bytes to `Image` we'll need some extra imports.

In [None]:
from PIL import Image
from io import BytesIO

### Functions

Here is another function that we'll use for converting the Shopify data from bytes to the PIL `Image` format. Like before, I've added a docstring to show what the function is for.

In [None]:
def bytes_to_image(image_bytes):
    """Converts bytes to a PIL Image.

    Args:
        image_bytes: The image in bytes form.

    Returns:
        The image in PIL Image form.
    """
    return Image.open(BytesIO(image_bytes))
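
To get a feel for what `BytesIO` is doing, here is a sketch that uses plain bytes instead of a real image. `BytesIO` wraps bytes that are held in memory so that they can be read like a file, and `Image.open()` accepts a file-like object of this kind. The data here is made up.

```python
from io import BytesIO

# Some made-up bytes (a real image would have many thousands of these)
raw_data = b"not really an image"
print(type(raw_data))

# Wrap the bytes so that they behave like an open file
file_like = BytesIO(raw_data)

# Read the first three bytes, then the rest, just as with a file
first_chunk = file_like.read(3)
rest = file_like.read()
print(first_chunk)
print(rest)
```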

Now I can convert all the photos in the list using the new function and a comprehension. The PIL `Image` can be displayed in a Python Notebook by giving its name or its location in a list. That's something we can't do with `bytes`.

In [None]:
downloaded_photos = [bytes_to_image(data) for data in image_bytes]
downloaded_photos[0]

### Saving the Images

The `os` library can be used to create folders on your computer. As it is part of the Python Standard Library, it doesn't need to be installed. It's already included with a Python installation.

In [None]:
import os

# Choose a path for our workshop images
output_folder_name = "./drive/MyDrive/python-workshop"
# Ask the os library to create this folder
os.makedirs(output_folder_name, exist_ok=True)

# Pick a name for the folder in which the scraper images will be saved
picture_folder_name = "scraper-pictures"
# Create a combined path name
scraper_pictures_path = os.path.join(output_folder_name, picture_folder_name)
# Create a new "scraper-pictures" folder in Google Drive
os.makedirs(scraper_pictures_path, exist_ok=True)
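
The `exist_ok=True` part tells `os.makedirs()` not to complain if the folder is already there, which makes the cell safe to run more than once. Below is a sketch that uses a temporary folder so that nothing is written to your Drive. The folder names are made up for the example.

```python
import os
import tempfile

# Make a throwaway folder to experiment in
base_folder = tempfile.mkdtemp()
new_folder = os.path.join(base_folder, "python-workshop", "scraper-pictures")

# Creates "python-workshop" and "scraper-pictures" in one go
os.makedirs(new_folder, exist_ok=True)
# Running it again is fine because of exist_ok=True
os.makedirs(new_folder, exist_ok=True)

print(os.path.isdir(new_folder))

# Without exist_ok=True, Python raises an error if the folder already exists
try:
    os.makedirs(new_folder)
except FileExistsError:
    print("That folder already exists!")
```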

PIL `Image` objects also have a built-in `.save()` command. This can be used to save the photos that were downloaded.

The function below will take the `Image` object, the `theme` that we chose earlier, the count (a number that we will give to each of the photos), and the folder name. Using this, it will create a filename in the form of `folder-name/theme-XX.jpg` where `XX` is the count in two digit form.

The helpful thing about `os.path` is that it guarantees the path will work no matter what operating system you're using.
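
The `:02d` that appears in the filename is a formatting instruction meaning "show this whole number with at least two digits, adding a zero at the front if needed". A sketch with made-up values:

```python
import os

example_theme = "cats"
example_folder = "scraper-pictures"

for count in [0, 7, 12]:
    # :02d pads the number with a leading zero when it has only one digit
    filename = f"{example_theme}-{count:02d}.jpg"
    # os.path.join adds the right kind of slash for your operating system
    print(os.path.join(example_folder, filename))
```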

In [None]:
def photo_saver(image, theme, count, folder_name):
    """Saves a PIL Image to the disk.

    Args:
        image: The PIL Image to save.
        theme: The image theme.
        count: The image count.
        folder_name: The name of the folder that the image will be saved to.
    """
    # Create a filename for the image - using os.path helps ensure that things go well no matter what type of system you're using
    img_filename = os.path.join(folder_name, f"{theme}-{count:02d}.jpg")

    # Save the image using the filename we have created
    # (JPEG files can't store transparency, so convert to RGB first)
    image.convert("RGB").save(img_filename)

    # Print a message for assurance that something happened
    print(f"Saved {img_filename}")

Now I can use a loop to go through the images one by one and use the `photo_saver()` function on them. 

<details>
<summary>Click here if you want to know what <code>enumerate</code> is doing.</summary>

In Python `enumerate` is a special way of looping that lets me go through every item in a list, but also gives me a number corresponding with the item's location in the list.

Example:
```python
list_of_animals = ["chicken", "frog", "dolphin"]
for count, animal in enumerate(list_of_animals):
    print(count, animal)
```
That will give the following output:
```
0 chicken
1 frog
2 dolphin
```

I am using enumerate in this scenario because I want the image data as well as a number to be passed to my saving function so that it can create image filenames in the format `theme-XX.jpg` with `XX` being the number of the image.

</details>

In [None]:
# Go through the downloaded images one by one and save them into the folder that was just created
for count, img in enumerate(downloaded_photos):
    photo_saver(img, theme, count, scraper_pictures_path)

## Extra: Applying a Glitch Effect

![](glitched-image-example.jpg)

We can use a library called `glitch-this` to add a glitchy effect to the images. But first let's shrink the images slightly in order to give the glitching functions less work to do.

In [None]:
MAX_WIDTH = 800


def image_resize(image):
    """Resizes an image so that its width does not go beyond 800 pixels.

    Args:
        image: The image to resize.

    Returns:
        The resized image.
    """
    # Do nothing if the image is already small
    if image.size[0] <= MAX_WIDTH:
        return image

    # Otherwise resize the image
    factor = MAX_WIDTH / image.size[0]
    return image.resize((MAX_WIDTH, int(image.size[1] * factor)))


# Send all of our images to this function with a comprehension
downloaded_photos = [image_resize(photo) for photo in downloaded_photos]
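
The arithmetic inside `image_resize` can be checked without an image. The sketch below uses a made-up helper, `scaled_size`, that does the same calculation on a plain width and height. Multiplying the height by the same factor as the width keeps the picture's proportions.

```python
MAX_WIDTH = 800


def scaled_size(width, height):
    """Returns the (width, height) an image would have after resizing."""
    # Small images are left alone
    if width <= MAX_WIDTH:
        return (width, height)

    # Shrink the height by the same factor as the width
    factor = MAX_WIDTH / width
    return (MAX_WIDTH, int(height * factor))


print(scaled_size(1600, 1200))  # (800, 600)
print(scaled_size(640, 480))  # (640, 480)
```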

Python comes with a library called `random` that can choose a random item from a list. Because it is included with Python there is no need to install it with `pip`. Now let's pick a random photo from our list of downloaded photos and look at it.

In [None]:
import random

random_photo = random.choice(downloaded_photos)
random_photo
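
Here is a sketch of `random.choice()` on a simple list with made-up data. The result can change from run to run. Calling `random.seed()` with a fixed number first makes the "random" choices repeat, which can help when you want others to get the same output as you.

```python
import random

snacks = ["apple", "biscuit", "crisps", "doughnut"]

# A different snack may be picked each time this runs
print(random.choice(snacks))

# Seeding with the same number makes the choices repeatable
random.seed(1)
first_pick = random.choice(snacks)
random.seed(1)
second_pick = random.choice(snacks)
print(first_pick == second_pick)  # True
```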

Here is a [link](https://github.com/TotallyNotChase/glitch-this/wiki/Documentation:-The-glitch-this-library) to the documentation for `glitch-this`.

Like before it's installed using the command `%pip install glitch-this`.

In [None]:
%pip install glitch-this
from glitch_this import ImageGlitcher

Now with the library installed and imported, we can apply the glitch effect to the random photo and see what it looks like afterwards.

The documentation goes into more detail about what the different parameters for the `glitch_image` command are doing. (They can be thought of as "settings" that change the type of glitchiness we get.)

The inputs that we must give to this command are the `src_img` that we want to glitch, a `glitch_amount` float value, and something called `color_offset`. In the documentation it says that the `glitch_amount` may be any number from 0.1 to 10.0. Feel free to mess around with this value if you like.

The developers have said setting `color_offset` to `True` makes things look best, so I'll just take their word for it...

In [None]:
glitcher = ImageGlitcher()

glitched_image = glitcher.glitch_image(
    src_img=random_photo, glitch_amount=3.5, color_offset=True
)
glitched_image

Now to make things more interesting we can warp the image even further by using another glitching library called `pixelsort`.

In [None]:
%pip install pixelsort
from pixelsort import pixelsort

The [pixelsort documentation](https://github.com/satyarth/pixelsort) says a bit about what the different inputs do. I was pretty lost while looking it up, so I just messed around until I found some that I liked the most. Try changing these values if you are unsatisfied with the result and see if that makes things better.

<!-- Add something here about why it's good to eff around until something cool happens. -->

In [None]:
sort_image = pixelsort(
    glitched_image, sorting_function="intensity", interval_function="edges"
)

sort_image = sort_image.convert("RGB")
sort_image.save(os.path.join(output_folder_name, "glitched-image.jpg"))
sort_image

Now you should see this image and the scraper images in your Google Drive folder.

## Recap

+ Python Fundamentals
+ Using Python to download things
+ Going from one type of data to another
+ Using several libraries together to create more interesting programs
+ Python as a tool for adding effects to photos

## Tips for Learning Programming

+ You absolutely _don't_ need to learn/memorise everything
+ Most people remember a handful of things they use the most and look up the rest
+ Making mistakes is normal - [even the pros do it](https://github.com/MrMEEE/bumblebee-Old-and-abbandoned/issues/123)

<!-- Maybe add multiple examples here. -->

## FutureCoder

[FutureCoder](https://futurecoder.io) is a helpful and beginner-friendly Python programming course that can be run in your browser.

## Feedback

P-please fill in my workshop feedback form.  👉👈

https://moodle.arts.ac.uk/mod/feedback/view.php?id=1365721

## Python Notebooks Rant

Watch [this](https://www.youtube.com/watch?v=7jiPeIFXb6U&t=0s) in case you're wondering why some people have a problem with Python Notebooks...

On the other hand, [this person](https://www.youtube.com/watch?v=9Q6sLbz37gk) wants to stick up for Notebooks. Just making sure you know all the sides of the argument :)

## More Stuff

+ GANs with Python
+ Introduction to GitHub and Version Control - not specific to any language!
+ I am slowly creating some [guides](https://github.com/creativetechnologylab/coding-tutorials) for Python programming and dealing with common issues - check it out!

## Attendance

Please give me your names so I can keep track of attendance!