# Working with images, audio, and other assets

## **Downloading media content from the web**

Downloading media content from the web is a simple process: use Requests or another library and download it just like you would HTML content.

### **Getting ready**

There is a class named `URLUtility` in the `urls.py` module in the `util` folder of the solution. This class handles several of the scenarios in this chapter that involve downloading and parsing URLs. We will use it in this recipe and a few others.

### **How to do it**

In [1]:
import const
from urls import URLUtility

util = URLUtility(const.ApodEclipseImage())
print(len(util.data))

Reading URL: https://apod.nasa.gov/apod/image/1709/BT5643s.jpg
Read 171014 bytes
171014
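
The implementation of `URLUtility` is not shown in this recipe. As a rough sketch of what the download step might look like, the following uses only the standard library's `urllib.request`. The function name `read_url` and the timeout value are assumptions made for illustration and are not taken from the solution code.

```python
from urllib.request import urlopen


def read_url(url, timeout=10):
    """Fetch a URL and return the raw response body as bytes.

    Hypothetical helper: a sketch of the download step, not the
    actual URLUtility implementation.
    """
    print("Reading URL: " + url)
    # urlopen returns a file-like response; the with block closes it for us.
    with urlopen(url, timeout=timeout) as response:
        data = response.read()
    print("Read " + str(len(data)) + " bytes")
    return data


# Example call (requires network access):
# data = read_url("https://apod.nasa.gov/apod/image/1709/BT5643s.jpg")
```

Because an image is binary content, the bytes are kept as they are and never decoded to text.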


## **Parsing a URL with urllib to get the filename**

When downloading content from a URL, we often want to save it to a file. It is often good enough to name that file after the filename found in the URL. But a URL consists of several components, so how do we extract the actual filename, especially when it is followed by query parameters?

### **How to do it**

In [2]:
util = URLUtility(const.ApodEclipseImage())
print(util.filename_without_ext)

Reading URL: https://apod.nasa.gov/apod/image/1709/BT5643s.jpg
Read 171014 bytes
BT5643s


### **How it works**

In the constructor for `URLUtility`, there is a call to `urllib.parse.urlparse`. The following demonstrates using the function interactively:

```py
from urllib.parse import urlparse

parsed = urlparse(const.ApodEclipseImage())
parsed
```
ParseResult(scheme='https', netloc='apod.nasa.gov', path='/apod/image/1709/BT5643s.jpg', params='', query='', fragment='')

The `ParseResult` object contains the various components of the URL. The `path` element contains the path and the filename. The `.filename_without_ext` property returns just the filename without the extension:
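
The APOD URL used here has no query string, so it does not show how `urlparse` deals with parameters after the filename. The sketch below parses a hypothetical variant of that URL; the `size`, `dl`, and `preview` parts are invented for illustration. It shows that the query and fragment end up in their own fields and that `path` stays clean.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical URL: the query string and fragment are made up for this example.
url = "https://apod.nasa.gov/apod/image/1709/BT5643s.jpg?size=large&dl=1#preview"

parsed = urlparse(url)

# The path holds only the directories and the filename.
path = parsed.path
# parse_qs turns the raw query string into a dict of lists.
query = parse_qs(parsed.query)
# The fragment is whatever follows the '#'.
fragment = parsed.fragment

print(path)
print(query)
print(fragment)
```

Since `ParseResult` is a named tuple, the same values are also available by position, for example `parsed[2]` for the path.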

```py
@property
def filename_without_ext(self):
    filename = os.path.splitext(os.path.basename(self._parsed.path))[0]
    return filename
```
The call to `os.path.basename` returns only the filename portion of the path (including the extension). `os.path.splitext()` then separates the filename from the extension, and the property returns the first element of the resulting tuple (the filename).
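
To see the two steps separately, here is a minimal standalone sketch of the same idea. The function `filename_parts` is a hypothetical helper written for this illustration, not a member of `URLUtility`, and the `example.com` URLs in the usage note are placeholders.

```python
import os
from urllib.parse import urlparse


def filename_parts(url):
    """Return (name_without_ext, ext) for the last segment of a URL's path."""
    # Step 1: keep only the path, which drops any query string or fragment.
    path = urlparse(url).path
    # Step 2: keep only the last path segment, e.g. 'BT5643s.jpg'.
    base = os.path.basename(path)
    # Step 3: split off the final extension, e.g. ('BT5643s', '.jpg').
    name, ext = os.path.splitext(base)
    return name, ext


print(filename_parts("https://apod.nasa.gov/apod/image/1709/BT5643s.jpg"))
```

Two edge cases are worth knowing about. `os.path.splitext()` splits off only the last extension, so `https://example.com/files/archive.tar.gz` gives `('archive.tar', '.gz')`. A URL whose path ends in a slash has no filename at all, so both parts come back as empty strings.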

## **Determining the type of content for a URL**

When performing a `GET` request for content from a web server, the server returns a number of headers, one of which identifies the type of the content from the server's perspective. In this recipe, we learn to use that header to determine what the web server considers the content type to be.

### **How to do it**

We proceed as follows:

In [3]:
util = URLUtility(const.ApodEclipseImage())
print("The content type is: " + util.contenttype)

Reading URL: https://apod.nasa.gov/apod/image/1709/BT5643s.jpg
Read 171014 bytes
The content type is: image/jpeg


### **How it works**

The `.contenttype` property is implemented as follows:
```py
@property
def contenttype(self):
    self.ensure_response()
    return self._response.headers['content-type']
```
The `.headers` property of the `_response` object is a **dictionary-like class of headers**. The `content-type` key retrieves the content type specified by the server. The call to the `ensure_response()` method simply ensures that the `.read()` function has been executed.
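
For an image, the header value is usually just the media type, as in `image/jpeg`. For text content, servers often append parameters such as `; charset=utf-8`, so comparing the raw string against an expected type can fail. The following sketch shows one way to split the value using the standard library's `email.message.Message` class. The helper names `parse_content_type` and `is_image` are assumptions made for illustration and are not part of `URLUtility`.

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type header value into (media_type, charset).

    Hypothetical helper: charset is None when the header does not give one.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() returns the lowercased 'maintype/subtype' part only.
    return msg.get_content_type(), msg.get_content_charset()


def is_image(header_value):
    """Return True when the media type's main type is 'image'."""
    media_type, _ = parse_content_type(header_value)
    return media_type.split("/")[0] == "image"


print(parse_content_type("image/jpeg"))
print(parse_content_type("text/html; charset=UTF-8"))
print(is_image("image/jpeg"))
```

Header lookup on a `Message` object ignores case, which matches how HTTP treats header names.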