In case the `requests` module is not installed on your system, install it with:

```bash
pip install requests 
```

However, it should be part of the Anaconda distribution.

This material is based on http://docs.python-requests.org/en/master/user/quickstart/#quickstart and on chapter 17 in *Python Crash Course*, Eric Matthes.


We will have a look at the following *application programming interfaces (APIs)*:

  * http://openweathermap.org/api
  * https://developer.github.com/v3/
  
  

## URLs?

The URLs we use to talk to remote *Representational State Transfer (REST)* APIs usually consist of a scheme, a host, a path, and a query. In this tutorial we only work with APIs served over the *Hypertext Transfer Protocol (HTTP)* and its encrypted variant, HTTPS.


```
                    hierarchical part
        ┌───────────────────┴─────────────────────┐
                    authority               path
        ┌───────────────┴───────────────┐┌───┴────┐
  abc://username:password@example.com:123/path/data?key=value&key2=value2#fragid1
  └┬┘   └───────┬───────┘ └────┬────┘ └┬┘           └─────────┬─────────┘ └──┬──┘
scheme  user information     host     port                  query         fragment
```


The example above is from https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Examples.
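
To see how these components map onto Python, here is a minimal sketch that splits the example URI from the diagram with the standard library's `urllib.parse`. The variable names `uri` and `parts` are chosen for illustration only.

```python
from urllib.parse import urlparse, parse_qs

# Example URI taken from the diagram above
uri = 'abc://username:password@example.com:123/path/data?key=value&key2=value2#fragid1'
parts = urlparse(uri)

print(parts.scheme)    # abc
print(parts.username)  # username
print(parts.hostname)  # example.com
print(parts.port)      # 123
print(parts.path)      # /path/data
print(parts.query)     # key=value&key2=value2
print(parts.fragment)  # fragid1

# parse_qs turns the query string into a dictionary of lists
print(parse_qs(parts.query))  # {'key': ['value'], 'key2': ['value2']}
```

For the REST APIs in this tutorial, the host, the path, and the query are the parts that matter most.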

# Working with APIs on the CLI

On the command line (CLI) of a Unix environment, you usually have access to either `curl` or `wget`. Both are similar and can, among other things, interact with HTTP-based REST APIs.

In [49]:
%%bash
# Quote the URL, otherwise the shell treats the & as "run in the background"
curl 'https://api.github.com/search/repositories?q=language:python&sort=stars'

{
  "total_count": 3070054,
  "incomplete_results": false,
  "items": [
    {
      "id": 21289110,
      "node_id": "MDEwOlJlcG9zaXRvcnkyMTI4OTExMA==",
      "name": "awesome-python",
      "full_name": "vinta/awesome-python",
      "private": false,
      "owner": {
        "login": "vinta",
        "id": 652070,
        "node_id": "MDQ6VXNlcjY1MjA3MA==",
        "avatar_url": "https://avatars2.githubusercontent.com/u/652070?v=4",
        "gravatar_id": "",
        "url": "https://api.github.com/users/vinta",
        "html_url": "https://github.com/vinta",
        "followers_url": "https://api.github.com/users/vinta/followers",
        "following_url": "https://api.github.com/users/vinta/following{/other_user}",
        "gists_url": "https://api.github.com/users/vinta/gists{/gist_id}",
        "starred_url": "https://api.github.com/users/vinta/starred{/owner}{/repo}",
        "subscriptions_url": "https://api.github.com/users/vinta/subscriptions",
        "organizations_url": "https:



In [None]:
%%bash

# Quote the URL, otherwise the shell treats the & as "run in the background"
wget -O - 'https://api.github.com/search/repositories?q=language:python&sort=stars'

# Working with APIs from Python
 

## Make a Request

In this tutorial we mostly collect information with HTTP `GET` requests. The `requests` module also supports HTTP `POST`, `PUT`, `DELETE`, `HEAD`, and `OPTIONS` via correspondingly named functions. See http://docs.python-requests.org/en/master/user/quickstart/#make-a-request for more details.

You can access the status code of an HTTP request via the `status_code` attribute. 

In [50]:
import requests


url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'

r = requests.get(url)
print(type(r))
print(r.url)
print(r.status_code)
print(r.json())

results = r.json()['items']

<class 'requests.models.Response'>
https://api.github.com/search/repositories?q=language:python&sort=stars
200
{'total_count': 3070063, 'incomplete_results': False, 'items': [{'id': 21289110, 'node_id': 'MDEwOlJlcG9zaXRvcnkyMTI4OTExMA==', 'name': 'awesome-python', 'full_name': 'vinta/awesome-python', 'private': False, 'owner': {'login': 'vinta', 'id': 652070, 'node_id': 'MDQ6VXNlcjY1MjA3MA==', 'avatar_url': 'https://avatars2.githubusercontent.com/u/652070?v=4', 'gravatar_id': '', 'url': 'https://api.github.com/users/vinta', 'html_url': 'https://github.com/vinta', 'followers_url': 'https://api.github.com/users/vinta/followers', 'following_url': 'https://api.github.com/users/vinta/following{/other_user}', 'gists_url': 'https://api.github.com/users/vinta/gists{/gist_id}', 'starred_url': 'https://api.github.com/users/vinta/starred{/owner}{/repo}', 'subscriptions_url': 'https://api.github.com/users/vinta/subscriptions', 'organizations_url': 'https://api.github.com/users/vinta/orgs', 'repos_

In [3]:
len(results)

30

In [51]:
summary = [(el['full_name'], el['stargazers_count'], el['html_url'], 
            el['description']) for el in results[:10]]

for name, stars, url, desc in summary:
    print(name)
    print(stars)
    print(url)
    print(desc)
    print('---------------')

vinta/awesome-python
55970
https://github.com/vinta/awesome-python
A curated list of awesome Python frameworks, libraries, software and resources
---------------
donnemartin/system-design-primer
49925
https://github.com/donnemartin/system-design-primer
Learn how to design large-scale systems. Prep for the system design interview.  Includes Anki flashcards.
---------------
toddmotto/public-apis
42699
https://github.com/toddmotto/public-apis
A collective list of public JSON APIs for use in web development.
---------------
rg3/youtube-dl
42515
https://github.com/rg3/youtube-dl
Command-line program to download videos from YouTube.com and other video sites
---------------
tensorflow/models
42441
https://github.com/tensorflow/models
Models and examples built with TensorFlow
---------------
pallets/flask
39237
https://github.com/pallets/flask
The Python micro framework for building web applications.
---------------
nvbn/thefuck
37667
https://github.com/nvbn/thefuck
Magnificent app which corre

## Passing Parameters In URLs

To get a weather forecast, we need to specify, for example, the place we want the forecast for and the format in which we want to receive the response. All those parameters are passed as a dictionary to the `params` keyword argument.

In [5]:
import json
import api_keys
import requests


url = "http://api.openweathermap.org/data/2.5/forecast"
query = {'q': 'Copenhagen,dk', 
         'mode': 'json',                       
         'units': 'metric',
         'appid': api_keys.OWM_API_KEY}
r = requests.get(url, params=query)

r.json()

{'city': {'coord': {'lat': 55.6867, 'lon': 12.5701},
  'country': 'DK',
  'id': 2618425,
  'name': 'Copenhagen',
  'population': 15000},
 'cnt': 40,
 'cod': '200',
 'list': [{'clouds': {'all': 76},
   'dt': 1521039600,
   'dt_txt': '2018-03-14 15:00:00',
   'main': {'grnd_level': 1024.4,
    'humidity': 98,
    'pressure': 1024.4,
    'sea_level': 1026.58,
    'temp': 1.88,
    'temp_kf': -0.66,
    'temp_max': 2.55,
    'temp_min': 1.88},
   'rain': {'3h': 0.0125},
   'snow': {'3h': 0.001},
   'sys': {'pod': 'd'},
   'weather': [{'description': 'light rain',
     'icon': '10d',
     'id': 500,
     'main': 'Rain'}],
   'wind': {'deg': 103.001, 'speed': 1.87}},
  {'clouds': {'all': 68},
   'dt': 1521050400,
   'dt_txt': '2018-03-14 18:00:00',
   'main': {'grnd_level': 1025.3,
    'humidity': 94,
    'pressure': 1025.3,
    'sea_level': 1027.59,
    'temp': 0.96,
    'temp_kf': -0.5,
    'temp_max': 1.45,
    'temp_min': 0.96},
   'rain': {'3h': 0.0425},
   'snow': {'3h': 0.049},
   'sy
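
To make visible what passing `params` does, the following sketch builds the query string by hand with the standard library. It assumes that `requests` encodes the dictionary in much the same way, so the result should be close to what `r.url` reports. The value `'YOUR_API_KEY'` is a placeholder, not a working key.

```python
from urllib.parse import urlencode

url = 'http://api.openweathermap.org/data/2.5/forecast'
# 'YOUR_API_KEY' is a placeholder for illustration only
query = {'q': 'Copenhagen,dk',
         'mode': 'json',
         'units': 'metric',
         'appid': 'YOUR_API_KEY'}

# urlencode joins the pairs with & and percent-encodes special characters
full_url = url + '?' + urlencode(query)
print(full_url)
```

Note that the comma in `Copenhagen,dk` is percent-encoded as `%2C`. Letting the library do this is less error-prone than concatenating strings yourself.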

## Response Content

### As Text

`requests` will automatically decode content from the server. Most unicode charsets are seamlessly decoded.

When you make a request, `requests` makes educated guesses about the encoding of the response based on the HTTP headers. The text encoding guessed by `requests` is used when you access `r.text`.

In [6]:
import requests

# A call to the Github timeline
r = requests.get('https://api.github.com/events')
# response encoding
print(r.encoding)
# response content
print(r.text)

utf-8
[{"id":"7378610262","type":"PullRequestEvent","actor":{"id":1424573,"login":"johno","display_login":"johno","gravatar_id":"","url":"https://api.github.com/users/johno","avatar_url":"https://avatars.githubusercontent.com/u/1424573?"},"repo":{"id":14106970,"name":"tachyons-css/tachyons","url":"https://api.github.com/repos/tachyons-css/tachyons"},"payload":{"action":"closed","number":545,"pull_request":{"url":"https://api.github.com/repos/tachyons-css/tachyons/pulls/545","id":174873422,"html_url":"https://github.com/tachyons-css/tachyons/pull/545","diff_url":"https://github.com/tachyons-css/tachyons/pull/545.diff","patch_url":"https://github.com/tachyons-css/tachyons/pull/545.patch","issue_url":"https://api.github.com/repos/tachyons-css/tachyons/issues/545","number":545,"state":"closed","locked":false,"title":"Upgrade normalize.css from 7.0.0 to 8.0.0","user":{"login":"brianvanburken","id":1044316,"avatar_url":"https://avatars1.githubusercontent.com/u/1044316?v=4","gravatar_id":"","

### JSON Response Content

`requests` has a built-in JSON decoder, which returns the JSON response decoded into the corresponding Python object, typically a dictionary or a list.

In [None]:
r.json()

### Binary Response Content

You can also access the response body as bytes, for example when you request a file or an image.

In [7]:
r.content

b'[{"id":"7378610262","type":"PullRequestEvent","actor":{"id":1424573,"login":"johno","display_login":"johno","gravatar_id":"","url":"https://api.github.com/users/johno","avatar_url":"https://avatars.githubusercontent.com/u/1424573?"},"repo":{"id":14106970,"name":"tachyons-css/tachyons","url":"https://api.github.com/repos/tachyons-css/tachyons"},"payload":{"action":"closed","number":545,"pull_request":{"url":"https://api.github.com/repos/tachyons-css/tachyons/pulls/545","id":174873422,"html_url":"https://github.com/tachyons-css/tachyons/pull/545","diff_url":"https://github.com/tachyons-css/tachyons/pull/545.diff","patch_url":"https://github.com/tachyons-css/tachyons/pull/545.patch","issue_url":"https://api.github.com/repos/tachyons-css/tachyons/issues/545","number":545,"state":"closed","locked":false,"title":"Upgrade normalize.css from 7.0.0 to 8.0.0","user":{"login":"brianvanburken","id":1044316,"avatar_url":"https://avatars1.githubusercontent.com/u/1044316?v=4","gravatar_id":"","url"
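
The three views of a response body are closely related. The sketch below imitates that relationship with the standard library only. The byte string is a shortened, made-up stand-in for `r.content`, and the comments describe roughly, not exactly, what `requests` does internally.

```python
import json

# Shortened, made-up stand-in for r.content
content = b'[{"id": "1", "type": "PullRequestEvent"}]'
encoding = 'utf-8'  # the value r.encoding reported above

text = content.decode(encoding)  # roughly what r.text returns
data = json.loads(text)          # roughly what r.json() returns

print(type(content), type(text), type(data))
print(data[0]['type'])
```

Use the bytes for files and images, the text for human-readable output, and the decoded JSON when you want to work with the data.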

## Writing Response to a file

In [1]:
import requests


user_url = 'https://api.github.com/users/HelgeCPH'
r = requests.get(user_url)
img_url = r.json()['avatar_url']
print(img_url)
r = requests.get(img_url)

filename = './avatar.jpg'

with open(filename, 'wb') as fd:
    fd.write(r.content)

https://avatars0.githubusercontent.com/u/21216985?v=4


![](avatar.jpg)

### Download Large Files or Responses

If you want to save a large file, it is a good idea to stream the incoming data: read it in small chunks and write them to disk one after another.

In [None]:
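# stream=True defers downloading the body until we iterate over it
r = requests.get(img_url, stream=True)
with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size=1024):
        fd.write(chunk)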

## Custom Headers, Authentication, Response Headers

If you want to send your request with a customized header, you can pass the header as a dictionary to the `headers` keyword argument of your request function call.

For example, one way to authenticate to the GitHub API is to send an API token in the header. This raises the rate limit to 5000 requests per hour. Note that to run the following code you first have to generate a GitHub API token (https://github.com/blog/1509-personal-api-tokens) and add it to the `api_keys` module. There are many other ways to authenticate, see http://docs.python-requests.org/en/master/user/authentication/#authentication.

The headers of a response are accessible as a dictionary via the `headers` attribute of the response object. In the following example, we inspect the response headers to get the links to further results, as the GitHub API splits results across many pages.

In [None]:
%%bash
echo "GITHUB_API_KEY = 'YOUR_API_KEY'" >> ./api_keys.py

**Important:** add `api_keys.py` to your `.gitignore` so that the token is never committed.

In [52]:
import api_keys
import requests
from tqdm import tqdm
from datetime import datetime
from urllib.parse import urlparse


url = 'https://api.github.com/repos/pallets/flask/contributors'
headers = {'Authorization': 'token {}'.format(api_keys.GITHUB_API_KEY)}

r = requests.get(url, headers=headers)
    
print(r.headers['X-RateLimit-Remaining'])
print(r.headers['X-RateLimit-Reset'])
print(datetime.fromtimestamp(int(r.headers['X-RateLimit-Reset'])))

contributors = [(contrib['login'], contrib['contributions'], contrib['html_url'])
                for contrib in r.json()]
 
print(r.headers)

4985
1539073814
2018-10-09 10:30:14
{'Server': 'GitHub.com', 'Date': 'Tue, 09 Oct 2018 08:02:31 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Status': '200 OK', 'X-RateLimit-Limit': '5000', 'X-RateLimit-Remaining': '4985', 'X-RateLimit-Reset': '1539073814', 'Cache-Control': 'private, max-age=60, s-maxage=60', 'Vary': 'Accept, Authorization, Cookie, X-GitHub-OTP', 'ETag': 'W/"2b05baffb5d475ae058ca0e98eb2445e"', 'Last-Modified': 'Tue, 09 Oct 2018 08:00:28 GMT', 'X-OAuth-Scopes': '', 'X-Accepted-OAuth-Scopes': '', 'X-GitHub-Media-Type': 'github.v3; format=json', 'Link': '<https://api.github.com/repositories/596892/contributors?page=2>; rel="next", <https://api.github.com/repositories/596892/contributors?page=14>; rel="last"', 'Access-Control-Expose-Headers': 'ETag, Link, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval', 'Access-Control-Allow-Origin':
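
The rate-limit headers can be used to decide whether, and for how long, a script should pause. The following sketch uses the header values printed above. The helper `seconds_until_reset` is hypothetical and not part of `requests` or the GitHub API. The timestamp is interpreted in UTC here, which is why it differs from the local time shown above.

```python
from datetime import datetime, timezone

# Header values copied from the response printed above
headers = {'X-RateLimit-Remaining': '4985',
           'X-RateLimit-Reset': '1539073814'}

remaining = int(headers['X-RateLimit-Remaining'])
# The reset value is a Unix timestamp in seconds
reset_at = datetime.fromtimestamp(int(headers['X-RateLimit-Reset']),
                                  tz=timezone.utc)


def seconds_until_reset(now):
    """Hypothetical helper: seconds left until the limit resets."""
    return max(0.0, (reset_at - now).total_seconds())


print(remaining)
print(reset_at.isoformat())
# Time of the response according to its Date header
print(seconds_until_reset(datetime(2018, 10, 9, 8, 2, 31, tzinfo=timezone.utc)))
```

A script that makes many requests could sleep for that many seconds once `remaining` reaches zero.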

In [54]:
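def gen_next_links(headers_link_str):
    """Build the URLs of all remaining pages from a two-entry Link header."""
    # The header looks like '<url?page=2>; rel="next", <url?page=14>; rel="last"'
    next_page_str, last_page_str = headers_link_str.split(',')
    # Strip surrounding whitespace and the enclosing angle brackets
    next_page_link = next_page_str.split(';')[0].strip()[1:-1]
    last_page_link = last_page_str.split(';')[0].strip()[1:-1]
    # Everything up to and including 'page=' is shared by all page URLs
    link_base = next_page_link.rsplit('=', 1)[0] + '='
    start_idx = int(urlparse(next_page_link).query.split('=')[1])
    end_idx = int(urlparse(last_page_link).query.split('=')[1])
    return [link_base + str(idx) for idx in range(start_idx, end_idx + 1)]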


next_urls = gen_next_links(r.headers['Link'])
for next_url in tqdm(next_urls):
    r = requests.get(next_url, headers=headers)
    contributors += [(contrib['login'], contrib['contributions'], contrib['html_url'])
                     for contrib in r.json()]

100%|██████████| 13/13 [00:13<00:00,  1.01s/it]
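
Splitting the `Link` header by position only works while it contains exactly a `next` and a `last` entry, in that order. As a more tolerant alternative, the sketch below maps every `rel` name to its URL with a regular expression. The helper `parse_link_header` is hypothetical. `requests` also exposes the parsed header as `r.links`, which is worth trying before writing your own parser.

```python
import re

# Link header copied from the response printed above
link_header = (
    '<https://api.github.com/repositories/596892/contributors?page=2>; rel="next", '
    '<https://api.github.com/repositories/596892/contributors?page=14>; rel="last"'
)


def parse_link_header(value):
    """Hypothetical helper: map each rel name to its URL."""
    pairs = re.findall(r'<([^>]+)>;\s*rel="([^"]+)"', value)
    return {rel: url for url, rel in pairs}


links = parse_link_header(link_header)
print(links['next'])
print(links['last'])
```

With such a dictionary you can also follow `links['next']` page by page until the key is missing, instead of generating all URLs up front.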


In [55]:
import pygal


print('There are {} contributors to Flask.'.format(len(contributors)))

chart = pygal.Bar(x_label_rotation=80, show_legend=False, spacing=170, 
                  height=1000, width=4000)
chart.title = 'Contributions to Flask on GitHub'

names, no_contrib, _ = zip(*contributors)

values = []
for label, value, link in contributors:
    s_dict = {
        'value': value,
        'label': label,
        'xlink': {'href': link}}
    values.append(s_dict)


chart.x_labels = names
chart.add('', values)
# render_in_browser() accepts no positional file name;
# render_to_file() writes the SVG to disk instead
chart.render_to_file('contrib_flask.svg')

There are 404 contributors to Flask.


![](contrib_flask.svg)

# An intro to `multiprocessing`



In [7]:
contributor_urls = ['https://api.github.com/repositories/596892/contributors?page=1'] + next_urls
contributor_urls

['https://api.github.com/repositories/596892/contributors?page=1',
 'https://api.github.com/repositories/596892/contributors?page=2',
 'https://api.github.com/repositories/596892/contributors?page=3',
 'https://api.github.com/repositories/596892/contributors?page=4',
 'https://api.github.com/repositories/596892/contributors?page=5',
 'https://api.github.com/repositories/596892/contributors?page=6',
 'https://api.github.com/repositories/596892/contributors?page=7',
 'https://api.github.com/repositories/596892/contributors?page=8',
 'https://api.github.com/repositories/596892/contributors?page=9',
 'https://api.github.com/repositories/596892/contributors?page=10',
 'https://api.github.com/repositories/596892/contributors?page=11',
 'https://api.github.com/repositories/596892/contributors?page=12',
 'https://api.github.com/repositories/596892/contributors?page=13',
 'https://api.github.com/repositories/596892/contributors?page=14']

In [None]:
import os
import sys
import time
import logging
import requests
import api_keys
from multiprocessing import Pool, cpu_count


HEADER = {'Authorization': f'token {api_keys.GITHUB_API_KEY}'}
contributor_urls = [f'https://api.github.com/repositories/596892/contributors?page={idx}' for idx in range(1, 15)]


def hard_work(a_url):
    print(f'{__name__}/{os.getppid()}/{os.getpid()} gets data from {a_url}')
    r = requests.get(a_url, headers=HEADER)
    time.sleep(6)
    print('Done')
    return [(contrib['login'], contrib['contributions'],
             contrib['html_url']) for contrib in r.json()]
    # return []


def run_sequential_download():
    contributors = []
    # print instead of logging.info, which is silent without logging configuration
    print('Running the sequential program.')
    start = time.time()
    for contributor_url in contributor_urls:
        contributors += hard_work(contributor_url)
    print(f'It took {time.time() - start}s in total.')

    return contributors


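def run_parallel_processes():
    workers = cpu_count()

    print('Running the concurrent program.')
    start = time.time()
    # The context manager terminates the worker processes when the block is left
    with Pool(processes=workers) as pool:
        result = pool.map(hard_work, contributor_urls)

    print(f'It took {time.time() - start}s in total.')
    return result


if __name__ == '__main__':
    # Expect one flag: -s (sequential) or -p (parallel)
    mode = sys.argv[1] if len(sys.argv) > 1 else ''
    if mode == '-s':
        run_sequential_download()
    elif mode == '-p':
        run_parallel_processes()
    else:
        print(f'Usage: python {sys.argv[0]} [-s|-p]')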


Save the code above as `multiprocessing_example.py`. Then run `top` in one terminal and `python multiprocessing_example.py -p` in another to watch the worker processes:

~~~bash
Running the concurrent program.
__main__/21782/21783 gets data from https://api.github.com/repositories/596892/contributors?page=1
__main__/21782/21784 gets data from https://api.github.com/repositories/596892/contributors?page=2
__main__/21782/21785 gets data from https://api.github.com/repositories/596892/contributors?page=3
__main__/21782/21786 gets data from https://api.github.com/repositories/596892/contributors?page=4
__main__/21782/21787 gets data from https://api.github.com/repositories/596892/contributors?page=5
__main__/21782/21788 gets data from https://api.github.com/repositories/596892/contributors?page=6
__main__/21782/21789 gets data from https://api.github.com/repositories/596892/contributors?page=7
__main__/21782/21790 gets data from https://api.github.com/repositories/596892/contributors?page=8
Done
__main__/21782/21790 gets data from https://api.github.com/repositories/596892/contributors?page=9
Done
__main__/21782/21788 gets data from https://api.github.com/repositories/596892/contributors?page=10
Done
Done
__main__/21782/21784 gets data from https://api.github.com/repositories/596892/contributors?page=11
__main__/21782/21785 gets data from https://api.github.com/repositories/596892/contributors?page=12
Done
Done
__main__/21782/21789 gets data from https://api.github.com/repositories/596892/contributors?page=13
__main__/21782/21787 gets data from https://api.github.com/repositories/596892/contributors?page=14
Done
Done
Done
Done
Done
Done
Done
Done
It took 19.567875146865845s in total.
~~~


~~~bash
PID    COMMAND      %CPU  TIME     #TH    #WQ   #PORT MEM    PURG   CMPRS  PGRP  PPID  STATE    BOOSTS           %CPU_ME %CPU_OTHRS UID  FAULTS    COW      MSGSENT    MSGRECV
21790  python3.6    0.0   00:00.04 4      2     31    12M    0B     0B     21782 21782 sleeping *0[2]            0.00000 0.00000    501  5340      1819     81         34
21789  python3.6    0.0   00:00.04 4      2     31    12M    0B     0B     21782 21782 sleeping *0[2]            0.00000 0.00000    501  5353      1730     81         34
21788  python3.6    0.0   00:00.04 4      2     31    12M    0B     0B     21782 21782 sleeping *0[2]            0.00000 0.00000    501  5323      1725     81         34
21787  python3.6    0.0   00:00.04 4      2     31    12M    0B     0B     21782 21782 sleeping *0[2]            0.00000 0.00000    501  5350      1762     81         34
21786  python3.6    0.0   00:00.04 4      2     31    12M    0B     0B     21782 21782 sleeping *0[2]            0.00000 0.00000    501  5342      1746     81         34
21785  python3.6    0.0   00:00.04 4      2     31    12M    0B     0B     21782 21782 sleeping *0[2]            0.00000 0.00000    501  5358      1794     81         34
21784  python3.6    0.0   00:00.04 5      3     33    12M    0B     0B     21782 21782 sleeping *0[2]            0.00000 0.00000    501  5328      1799     81         34
21783  python3.6    0.0   00:00.04 5      3     32    12M    0B     0B     21782 21782 sleeping *0[2]            0.00000 0.00000    501  5344      1889     81         34
21782  python3.6    0.1   00:00.26 4      0     17    19M    0B     0B     21782 21556 sleeping *0[1]            0.00000 0.00000    501  8136      2015     59         26
~~~
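
One detail of the program above is easy to miss: `run_sequential_download` returns a flat list of tuples, whereas `pool.map` returns one list per URL. The following sketch shows one way to flatten such a nested result. The data is made up and only stands in for what `pool.map(hard_work, contributor_urls)` returns.

```python
from itertools import chain

# Made-up stand-in for the return value of pool.map(hard_work, contributor_urls)
result = [
    [('alice', 10, 'https://github.com/alice')],
    [('bob', 7, 'https://github.com/bob'),
     ('carol', 3, 'https://github.com/carol')],
]

# chain.from_iterable concatenates the per-URL lists in their original order
contributors = list(chain.from_iterable(result))
print(len(contributors))  # 3
print(contributors[1])    # ('bob', 7, 'https://github.com/bob')
```

Because `pool.map` preserves the order of its inputs, the flattened list has the same order as the sequential version.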

## Let the OS do this.



```python
import os
import sys
import time
import requests
import api_keys


HEADER = {'Authorization': f'token {api_keys.GITHUB_API_KEY}'}


def hard_work(a_url):
    sys.stdout.write(f'{__name__}/{os.getppid()}/{os.getpid()} gets data from {a_url}\n')
    r = requests.get(a_url, headers=HEADER)
    time.sleep(3)
    sys.stdout.write('Done\n')
    return [(contrib['login'], contrib['contributions'],
             contrib['html_url']) for contrib in r.json()]


if __name__ == '__main__':
    sys.stdout.write(str(hard_work(sys.argv[1])) + '\n')
```

~~~bash
#!/bin/bash
base='https://api.github.com/repositories/596892/contributors?page='
for page in $(seq 1 14)
do
    url="${base}${page}"
    echo "Started python ./hard_work.py ${url}"
    # Quote the URL so that the shell passes it as a single, unmodified argument
    nohup python ./hard_work.py "${url}" </dev/null >> output.log 2>&1 &
done
~~~

Read the official docs for more information, for example on how to share data between processes: https://docs.python.org/3.6/library/multiprocessing.html

# An intro to generators

In [59]:
[i for i in range(1000, 500, -5)]

In [64]:
(i for i in range(1000, 500, -5))

<generator object <genexpr> at 0x109f6d990>

In [65]:
def gen_my_values():
    for i in range(1000, 500, -5):
        yield i

In [66]:
g = gen_my_values()
print(next(g))
print(next(g))
print(next(g))

1000
995
990


In [None]:
for i in gen_my_values():
    print(i)
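
The difference between the list and the generator shows in when the values are computed. The sketch below repeats `gen_my_values` from above so that it runs on its own, and compares the two. The size comparison holds on CPython and may differ on other interpreters.

```python
import sys
from itertools import islice


def gen_my_values():
    for i in range(1000, 500, -5):
        yield i


as_list = [i for i in range(1000, 500, -5)]  # all values exist right away
as_gen = (i for i in range(1000, 500, -5))   # values are produced on demand

print(len(as_list))  # 100
# On CPython the generator object is smaller than the list of 100 values
print(sys.getsizeof(as_list), sys.getsizeof(as_gen))

# islice takes only the first three values, the rest is never computed
print(list(islice(gen_my_values(), 3)))  # [1000, 995, 990]

g = gen_my_values()
consumed = list(g)    # runs the generator to the end
print(len(consumed))  # 100
print(list(g))        # [] because a generator can only be iterated once
```

The last two lines show the main caveat: once a generator is exhausted, you have to create a new one to iterate again.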

### A practical application, reading large files

When you read a file line by line that you cannot, or do not want to, fit into memory, you can do something like the following:




In [43]:
import os
from memory_profiler import profile


@profile
def read_linewise(path):
    with open(path) as fp:
        for line in fp:
            yield line


@profile
def read_complete(path):
    with open(path) as fp:
        return fp.readlines()


@profile
def print_file_contents():
    for line in read_linewise('moby_dick.txt'):
        print(line, end='')


if __name__ == '__main__':
    if not os.path.isfile('moby_dick.txt'):
        os.system('wget -O moby_dick.txt http://www.gutenberg.org/files/2701/2701-0.txt')
    print_file_contents()


~~~bash
$ python -m memory_profiler generator_example.py
Filename: generator_example.py

Line #    Mem usage    Increment   Line Contents
================================================
    18     44.8 MiB     44.8 MiB   @profile
    19                             def print_file_contents():
    20     44.9 MiB      0.2 MiB       for line in read_linewise('moby_dick.txt'):
    21     44.9 MiB      0.0 MiB           print(line, end='')
~~~

~~~bash
$ python -m memory_profiler generator_example.py
Filename: generator_example.py

Line #    Mem usage    Increment   Line Contents
================================================
    18     44.8 MiB     44.8 MiB   @profile
    19                             def print_file_contents():
    20     47.7 MiB      2.8 MiB       for line in read_complete('moby_dick.txt'):
    21     47.7 MiB      0.0 MiB           print(line, end='')
~~~