# Lab: API

In [87]:
import requests
import pandas as pd
import json
import re
import random
from time import sleep

## API 1: Gutendex

The first thing: there's an article I'd really like to write and get published, based on the following remark by T.S. Eliot:

"Comparison and analysis \[...\] are the chief tools of the critic. It is obvious indeed that they are tools, to be handled with care, and not employed in an inquiry into **the number of times giraffes are mentioned in the English novel**" (T.S. Eliot, in *The Function of Criticism* (1923), reprinted in *Selected Essays*, pp. 32-33)

So what I'd like to do is get as many English novels (full-text) as I can, and see how often giraffes are mentioned in them, and do some analysis on that - are there any patterns? Any periods when giraffes are particularly (un)popular? Etc. 

The Gutendex API seems like a good place to start, since Project Gutenberg has many full-text books. If I can get a list of English novels, that can give me a place to start downloading full-text books to see if giraffes are mentioned in them.
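Once a full text is in hand, the counting step itself is small. A minimal sketch, assuming that a case-insensitive whole-word match on "giraffe" or "giraffes" is what should count as a mention. `count_giraffes` is a hypothetical helper name, and older spellings or synonyms are not covered:

```python
import re

# Whole-word, case-insensitive match for "giraffe" or "giraffes".
GIRAFFE_RE = re.compile(r"\bgiraffes?\b", re.IGNORECASE)


def count_giraffes(text: str) -> int:
    """Return the number of giraffe mentions in a full text."""
    return len(GIRAFFE_RE.findall(text))
```

The word boundaries keep longer words that merely start with "giraffe" from being counted, while possessives such as "giraffe's" still match.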

In [30]:
gutendex = requests.get("https://gutendex.com/books?languages=en&subjects=Fiction")

In [34]:
print(gutendex.status_code)

200


In [79]:
result=gutendex.json()

In [80]:
books = pd.json_normalize(result['results'])

In [81]:
books

Unnamed: 0,id,title,authors,translators,subjects,bookshelves,languages,copyright,media_type,download_count,...,formats.application/x-mobipocket-ebook,formats.application/rdf+xml,formats.text/html,formats.application/epub+zip,formats.text/plain; charset=us-ascii,formats.application/octet-stream,formats.text/plain,formats.text/html; charset=utf-8,formats.text/plain; charset=utf-8,formats.text/html; charset=iso-8859-1
0,2641,A Room with a View,"[{'name': 'Forster, E. M. (Edward Morgan)', 'b...",[],"[British -- Italy -- Fiction, England -- Ficti...",[Italy],[en],False,Text,82903,...,https://www.gutenberg.org/ebooks/2641.kindle.i...,https://www.gutenberg.org/ebooks/2641.rdf,https://www.gutenberg.org/files/2641/2641-h/26...,https://www.gutenberg.org/ebooks/2641.epub3.im...,https://www.gutenberg.org/files/2641/2641-0.txt,https://www.gutenberg.org/files/2641/2641-0.zip,,,,
1,16389,The Enchanted April,"[{'name': 'Von Arnim, Elizabeth', 'birth_year'...",[],"[British -- Italy -- Fiction, Domestic fiction...","[Bestsellers, American, 1895-1923]",[en],False,Text,70806,...,https://www.gutenberg.org/ebooks/16389.kindle....,https://www.gutenberg.org/ebooks/16389.rdf,https://www.gutenberg.org/files/16389/16389-h/...,https://www.gutenberg.org/ebooks/16389.epub3.i...,https://www.gutenberg.org/files/16389/16389-0.txt,https://www.gutenberg.org/files/16389/16389-0.zip,https://www.gutenberg.org/ebooks/16389.txt.utf-8,,,
2,1342,Pride and Prejudice,"[{'name': 'Austen, Jane', 'birth_year': 1775, ...",[],"[Courtship -- Fiction, Domestic fiction, Engla...","[Best Books Ever Listings, Harvard Classics]",[en],False,Text,57845,...,https://www.gutenberg.org/ebooks/1342.kindle.i...,https://www.gutenberg.org/ebooks/1342.rdf,https://www.gutenberg.org/ebooks/1342.html.images,https://www.gutenberg.org/ebooks/1342.epub.images,,,,https://www.gutenberg.org/files/1342/1342-h/13...,https://www.gutenberg.org/files/1342/1342-0.txt,
3,84,"Frankenstein; Or, The Modern Prometheus","[{'name': 'Shelley, Mary Wollstonecraft', 'bir...",[],[Frankenstein's monster (Fictitious character)...,"[Gothic Fiction, Movie Books, Precursors of Sc...",[en],False,Text,46147,...,https://www.gutenberg.org/ebooks/84.kindle.images,https://www.gutenberg.org/ebooks/84.rdf,https://www.gutenberg.org/ebooks/84.html.images,https://www.gutenberg.org/ebooks/84.epub.images,,,,https://www.gutenberg.org/files/84/84-h/84-h.htm,https://www.gutenberg.org/files/84/84-0.txt,
4,145,Middlemarch,"[{'name': 'Eliot, George', 'birth_year': 1819,...",[],"[Bildungsromans, City and town life -- Fiction...","[Best Books Ever Listings, Historical Fiction]",[en],False,Text,39554,...,https://www.gutenberg.org/ebooks/145.kindle.im...,https://www.gutenberg.org/ebooks/145.rdf,https://www.gutenberg.org/files/145/145-h/145-...,https://www.gutenberg.org/ebooks/145.epub.images,https://www.gutenberg.org/files/145/145-0.txt,https://www.gutenberg.org/files/145/145-0.zip,https://www.gutenberg.org/ebooks/145.txt.utf-8,,,
5,394,Cranford,"[{'name': 'Gaskell, Elizabeth Cleghorn', 'birt...",[],"[England -- Fiction, Female friendship -- Fict...",[],[en],False,Text,36701,...,https://www.gutenberg.org/ebooks/394.kindle.im...,https://www.gutenberg.org/ebooks/394.rdf,https://www.gutenberg.org/files/394/394-h/394-...,https://www.gutenberg.org/ebooks/394.epub.images,https://www.gutenberg.org/files/394/394-0.txt,https://www.gutenberg.org/files/394/394-0.zip,https://www.gutenberg.org/ebooks/394.txt.utf-8,,,
6,67979,The Blue Castle: a novel,"[{'name': 'Montgomery, L. M. (Lucy Maud)', 'bi...",[],"[Canada -- History -- 1914-1945 -- Fiction, Ch...",[],[en],False,Text,33335,...,https://www.gutenberg.org/ebooks/67979.kindle....,https://www.gutenberg.org/ebooks/67979.rdf,https://www.gutenberg.org/files/67979/67979-h/...,https://www.gutenberg.org/ebooks/67979.epub.im...,https://www.gutenberg.org/files/67979/67979-0.txt,https://www.gutenberg.org/files/67979/67979-0.zip,https://www.gutenberg.org/ebooks/67979.txt.utf-8,,,
7,1661,The Adventures of Sherlock Holmes,"[{'name': 'Doyle, Arthur Conan', 'birth_year':...",[],"[Detective and mystery stories, English, Holme...","[Banned Books from Anne Haight's list, Contemp...",[en],False,Text,28322,...,https://www.gutenberg.org/ebooks/1661.kindle.i...,https://www.gutenberg.org/ebooks/1661.rdf,https://www.gutenberg.org/ebooks/1661.html.images,https://www.gutenberg.org/ebooks/1661.epub.images,,,,https://www.gutenberg.org/files/1661/1661-h/16...,https://www.gutenberg.org/files/1661/1661-0.txt,
8,11,Alice's Adventures in Wonderland,"[{'name': 'Carroll, Lewis', 'birth_year': 1832...",[],[Alice (Fictitious character from Carroll) -- ...,[Children's Literature],[en],False,Text,27088,...,https://www.gutenberg.org/ebooks/11.kindle.images,https://www.gutenberg.org/ebooks/11.rdf,https://www.gutenberg.org/ebooks/11.html.images,https://www.gutenberg.org/ebooks/11.epub.images,,,,https://www.gutenberg.org/files/11/11-h/11-h.htm,https://www.gutenberg.org/files/11/11-0.txt,
9,1952,The Yellow Wallpaper,"[{'name': 'Gilman, Charlotte Perkins', 'birth_...",[],"[Feminist fiction, Married women -- Psychology...",[Gothic Fiction],[en],False,Text,26648,...,https://www.gutenberg.org/ebooks/1952.kindle.i...,https://www.gutenberg.org/ebooks/1952.rdf,https://www.gutenberg.org/ebooks/1952.html.images,https://www.gutenberg.org/ebooks/1952.epub.images,,,,https://www.gutenberg.org/files/1952/1952-h/19...,https://www.gutenberg.org/files/1952/1952-0.txt,


That gives me 32 results per page. I can gather more by requesting further pages (starting with `https://gutendex.com/books/?languages=en&page=2&subjects=Fiction`), which I definitely will do shortly.
Also, I can filter out translated works by dropping every row with a non-empty value in the 'translators' column: anything translated falls outside T.S. Eliot's original remark.
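The translator filter, plus picking a plain-text download link, could be sketched as follows. This works on the raw `result['results']` list rather than the normalized DataFrame; `drop_translated` and `plain_text_url` are hypothetical helper names. The table above shows the plain-text key varying by charset, which is why the second helper matches on a prefix:

```python
def drop_translated(results):
    """Keep only books whose 'translators' list is empty."""
    return [book for book in results if not book.get("translators")]


def plain_text_url(formats):
    """Return the first URL whose MIME-type key starts with 'text/plain', else None."""
    for mime_type, url in formats.items():
        if mime_type.startswith("text/plain"):
            return url
    return None
```

For example, `[plain_text_url(book["formats"]) for book in drop_translated(result["results"])]` would give one candidate download link (or `None`) per untranslated book.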

### Getting those results

By fiddling around with the URL I found out that there are 1744 pages in total. Not all of the results are fiction, but I'll take a brute-force-ish approach: pull all 1744 pages into my notebook and then filter out what I don't want.

In [88]:
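frames = [books]  # start from the first page fetched above
for i in range(2,1745):
    print("Working on page ", str(i))
    page = f"https://gutendex.com/books/?languages=en&page={i}&subjects=Fiction"
    result = requests.get(page)
    result.raise_for_status()  # stop on an HTTP error instead of parsing an error body
    jsonified = result.json()
    frames.append(pd.json_normalize(jsonified['results']))
    
    # Pause for a random interval of up to 2 seconds between requests
    wait_time = random.randint(1,4000)
    print("I will sleep for " + str(wait_time/2000) + " seconds.")
    sleep(wait_time/2000)

# Concatenate once at the end; calling pd.concat inside the loop re-copies the growing frame every time
books = pd.concat(frames, axis = 0)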

Working on page  2
I will sleep for 0.881 seconds.
Working on page  3
I will sleep for 0.552 seconds.
Working on page  4
I will sleep for 1.4045 seconds.
...
Working on page  1696
I will sleep for 0.2285 seconds.
...
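Hard-coding the page count works, but it breaks as soon as the catalogue grows. A hedged alternative sketch, assuming each Gutendex response carries a `next` field holding the URL of the following page (null on the last one). `iter_results` is a hypothetical helper, and the `fetch` callable is passed in so the paging logic can be tried without touching the network:

```python
def iter_results(first_url, fetch, max_pages=None):
    """Yield every book record, following each page's 'next' link until it is empty."""
    url = first_url
    pages_seen = 0
    while url and (max_pages is None or pages_seen < max_pages):
        payload = fetch(url)        # expected to return the parsed JSON dict
        yield from payload["results"]
        url = payload.get("next")   # None on the last page ends the loop
        pages_seen += 1
```

In the notebook this might be called as `list(iter_results(start_url, fetch=lambda url: requests.get(url, timeout=30).json()))`, with the random sleep moved inside the `fetch` function.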

In [97]:
books.tail(60)

Unnamed: 0,level_0,index,id,title,authors,translators,subjects,bookshelves,languages,copyright,...,formats.application/x-iso9660-image,formats.video/mpeg,formats.application/x-musescore,formats.video/quicktime,formats.video/x-msvideo,formats.image/tiff,formats.text/plain; charset=ibm850,formats.audio/x-wav,formats.audio/x-ms-wma,formats.text/plain; charset=big5
55756,55756,12,69043,Star book no. 46: Chair backs,"[{'name': 'Anonymous', 'birth_year': None, 'de...",[],[],[],[en],False,...,,,,,,,,,,
55757,55757,13,69044,The story of Ida: epitaph on an Etrurian tomb,"[{'name': 'Alexander, Francesca', 'birth_year'...",[],[],[],[en],False,...,,,,,,,,,,
55758,55758,14,69046,"Jewels and the woman: The romance, magic and a...","[{'name': 'Ostier, Marianne', 'birth_year': No...",[],[],[],[en],False,...,,,,,,,,,,
55759,55759,15,69047,The Tiddly Winks,"[{'name': 'Smith, Laura Rountree', 'birth_year...",[],[],[],[en],False,...,,,,,,,,,,
55760,55760,16,69048,The boomerang circuit,"[{'name': 'Leinster, Murray', 'birth_year': 18...",[],[],[],[en],False,...,,,,,,,,,,
55761,55761,17,69049,The flowering plants of Africa: An analytical ...,"[{'name': 'Thonner, Franz', 'birth_year': None...",[],[],[],[en],False,...,,,,,,,,,,
55762,55762,18,69050,The coming,"[{'name': 'Snaith, J. C. (John Collis)', 'birt...",[],[],[],[en],False,...,,,,,,,,,,
55763,55763,19,69051,Romances of the old town of Edinburgh,"[{'name': 'Leighton, Alexander', 'birth_year':...",[],[],[],[en],False,...,,,,,,,,,,
55764,55764,20,69052,The ward of Tecumseh,"[{'name': 'Marriott, Crittenden', 'birth_year'...",[],[],[],[en],False,...,,,,,,,,,,
55765,55765,21,69053,The conservation of energy,"[{'name': 'Stewart, Balfour', 'birth_year': No...",[],[],[],[en],False,...,,,,,,,,,,


In [96]:
books.reset_index(inplace = True)

In [95]:
books.tail()

Unnamed: 0,index,id,title,authors,translators,subjects,bookshelves,languages,copyright,media_type,...,formats.application/x-iso9660-image,formats.video/mpeg,formats.application/x-musescore,formats.video/quicktime,formats.video/x-msvideo,formats.image/tiff,formats.text/plain; charset=ibm850,formats.audio/x-wav,formats.audio/x-ms-wma,formats.text/plain; charset=big5
55811,3,69119,"An outlaw's pledge: or, The raid on the old st...","[{'name': 'Dair, Col. Spencer', 'birth_year': ...",[],[],[],[en],False,Text,...,,,,,,,,,,
55812,4,69121,An outlaw's diary: revolution,"[{'name': 'Tormay, Cécile', 'birth_year': 1876...",[],[],[],[en],False,Text,...,,,,,,,,,,
55813,5,69124,The hellflower,"[{'name': 'Smith, George O. (George Oliver)', ...",[],[],[],[en],False,Text,...,,,,,,,,,,
55814,6,69125,The higher education of women,"[{'name': 'Davies, Emily', 'birth_year': None,...",[],[],[],[en],False,Text,...,,,,,,,,,,
55815,7,69126,"The works of Mr. Thomas Brown, serious and com...","[{'name': 'Fillebrown, Thomas', 'birth_year': ...",[],[],[],[en],False,Text,...,,,,,,,,,,
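The rows above include titles that are clearly not fiction, all with empty subject lists, which suggests the `subjects=Fiction` part of the URL did not narrow the results. As far as I can tell Gutendex documents a `topic` parameter for subject searches, so that is worth checking against its documentation. Either way, the frame can be filtered after the fact. A minimal sketch, assuming a case-insensitive substring match on the subject headings is good enough (`has_fiction_subject` is a hypothetical helper):

```python
def has_fiction_subject(subjects):
    """True if any subject heading contains the word 'fiction' (case-insensitive)."""
    return any("fiction" in subject.lower() for subject in subjects)
```

Applied to the frame, this would be `books[books["subjects"].apply(has_fiction_subject)]`. Note the limitation: a book tagged only with headings like "Detective and mystery stories, English" would be dropped, as would every book with an empty subject list.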


And store it in a CSV so I don't have to go through this process ever again...

In [98]:
books.to_csv("booklist.csv")
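One thing to watch when reading the file back: columns that held Python lists (such as `subjects`, `languages`, and `authors`) are written to CSV as their string representation, so they come back as plain strings. A small sketch of a converter that restores them, assuming the cells contain valid Python literals (`parse_list_cell` is a hypothetical helper):

```python
import ast


def parse_list_cell(cell):
    """Turn a CSV cell such as "['en']" back into a Python list."""
    if not cell:
        return []  # treat an empty cell as an empty list
    return ast.literal_eval(cell)
```

It could be wired in with `pd.read_csv("booklist.csv", index_col=0, converters={"subjects": parse_list_cell, "languages": parse_list_cell})`.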

## API 2: Edamam

Another thing I'm interested in is building a predictor for recipes: given a certain number of ingredients, can I recommend or predict additional ones? (This is also a potential final project.) So I checked out the Edamam API.

The first search I did was for 'rice', one of the most versatile ingredients.

In [6]:
edamam = requests.get("https://api.edamam.com/api/recipes/v2?type=public&q=rice&app_id=f74e2e90&app_key=d7bfa6549679a0bf326334cb87253edf")

In [7]:
print(edamam.status_code)

200
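The request above embeds the app ID and key directly in the URL, so they end up in the saved notebook and in any output that echoes the URL. A hedged sketch of one way around that, reading the credentials from environment variables. The variable names `EDAMAM_APP_ID` and `EDAMAM_APP_KEY` and the helper `edamam_params` are assumptions for illustration, not anything Edamam prescribes:

```python
import os

EDAMAM_URL = "https://api.edamam.com/api/recipes/v2"


def edamam_params(query, env=os.environ):
    """Build the query parameters, reading credentials from the environment."""
    return {
        "type": "public",
        "q": query,
        "app_id": env["EDAMAM_APP_ID"],    # hypothetical variable name
        "app_key": env["EDAMAM_APP_KEY"],  # hypothetical variable name
    }
```

The call then becomes `requests.get(EDAMAM_URL, params=edamam_params("rice"))`, and `requests` takes care of URL-encoding the values.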


In [100]:
result = edamam.json()

In [101]:
result

{'from': 1,
 'to': 20,
 'count': 10000,
 '_links': {'next': {'href': 'https://api.edamam.com/api/recipes/v2?q=rice&app_key=d7bfa6549679a0bf326334cb87253edf&_cont=CHcVQBtNNQphDmgVQntAEX4BYlNtDQQHQmRIAWIaa1Z7DAUBUXlSAzcWawEgAVcGQ2BGC2FGYFAgBwJWFTRCAzcQawdyVwMVLnlSVSBMPkd5BgMbUSYRVTdgMgksRlpSAAcRXTVGcV84SU4%3D&type=public&app_id=f74e2e90',
   'title': 'Next page'}},
 'hits': [{'recipe': {'uri': 'http://www.edamam.com/ontologies/edamam.owl#recipe_b1957a6a4025b25f6da6aef1fad452d4',
    'label': 'Essentials: Rice',
    'image': 'https://edamam-product-images.s3.amazonaws.com/web-img/b71/b716942f16e3e9490829f7da8dba509e.jpg?X-Amz-Security-Token=...

In [102]:
edamam_df = pd.json_normalize(result["hits"])

In [103]:
edamam_df

Unnamed: 0,recipe.uri,recipe.label,recipe.image,recipe.images.THUMBNAIL.url,recipe.images.THUMBNAIL.width,recipe.images.THUMBNAIL.height,recipe.images.SMALL.url,recipe.images.SMALL.width,recipe.images.SMALL.height,recipe.images.REGULAR.url,...,recipe.totalDaily.VITD.unit,recipe.totalDaily.TOCPHA.label,recipe.totalDaily.TOCPHA.quantity,recipe.totalDaily.TOCPHA.unit,recipe.totalDaily.VITK1.label,recipe.totalDaily.VITK1.quantity,recipe.totalDaily.VITK1.unit,recipe.digest,_links.self.title,_links.self.href
0,http://www.edamam.com/ontologies/edamam.owl#re...,Essentials: Rice,https://edamam-product-images.s3.amazonaws.com...,https://edamam-product-images.s3.amazonaws.com...,100,100,https://edamam-product-images.s3.amazonaws.com...,200,200,https://edamam-product-images.s3.amazonaws.com...,...,%,Vitamin E,7.938201,%,Vitamin K,4.162715,%,"[{'label': 'Fat', 'tag': 'FAT', 'schemaOrgTag'...",Self,https://api.edamam.com/api/recipes/v2/b1957a6a...
1,http://www.edamam.com/ontologies/edamam.owl#re...,Rice Cereal Bars,https://edamam-product-images.s3.amazonaws.com...,https://edamam-product-images.s3.amazonaws.com...,100,100,https://edamam-product-images.s3.amazonaws.com...,200,200,https://edamam-product-images.s3.amazonaws.com...,...,%,Vitamin E,8.785067,%,Vitamin K,3.313333,%,"[{'label': 'Fat', 'tag': 'FAT', 'schemaOrgTag'...",Self,https://api.edamam.com/api/recipes/v2/4078f5c0...
2,http://www.edamam.com/ontologies/edamam.owl#re...,Perfect Sushi Rice,https://edamam-product-images.s3.amazonaws.com...,https://edamam-product-images.s3.amazonaws.com...,100,100,https://edamam-product-images.s3.amazonaws.com...,200,200,https://edamam-product-images.s3.amazonaws.com...,...,%,Vitamin E,0.0,%,Vitamin K,0.0,%,"[{'label': 'Fat', 'tag': 'FAT', 'schemaOrgTag'...",Self,https://api.edamam.com/api/recipes/v2/534ecb11...
3,http://www.edamam.com/ontologies/edamam.owl#re...,Rice-Milk Rice Pudding,https://edamam-product-images.s3.amazonaws.com...,https://edamam-product-images.s3.amazonaws.com...,100,100,https://edamam-product-images.s3.amazonaws.com...,200,200,https://edamam-product-images.s3.amazonaws.com...,...,%,Vitamin E,0.0,%,Vitamin K,0.0,%,"[{'label': 'Fat', 'tag': 'FAT', 'schemaOrgTag'...",Self,https://api.edamam.com/api/recipes/v2/62f902aa...
4,http://www.edamam.com/ontologies/edamam.owl#re...,Sushi Rice Recipe,https://edamam-product-images.s3.amazonaws.com...,https://edamam-product-images.s3.amazonaws.com...,100,100,https://edamam-product-images.s3.amazonaws.com...,200,200,https://edamam-product-images.s3.amazonaws.com...,...,%,Vitamin E,0.0,%,Vitamin K,0.0,%,"[{'label': 'Fat', 'tag': 'FAT', 'schemaOrgTag'...",Self,https://api.edamam.com/api/recipes/v2/e2044086...
5,http://www.edamam.com/ontologies/edamam.owl#re...,Cooked Basmati Rice,https://edamam-product-images.s3.amazonaws.com...,https://edamam-product-images.s3.amazonaws.com...,100,100,https://edamam-product-images.s3.amazonaws.com...,200,200,https://edamam-product-images.s3.amazonaws.com...,...,%,Vitamin E,1.356667,%,Vitamin K,0.154167,%,"[{'label': 'Fat', 'tag': 'FAT', 'schemaOrgTag'...",Self,https://api.edamam.com/api/recipes/v2/44db99d3...
6,http://www.edamam.com/ontologies/edamam.owl#re...,Rainbow rice,https://edamam-product-images.s3.amazonaws.com...,https://edamam-product-images.s3.amazonaws.com...,100,100,https://edamam-product-images.s3.amazonaws.com...,200,200,https://edamam-product-images.s3.amazonaws.com...,...,%,Vitamin E,53.0805,%,Vitamin K,47.204271,%,"[{'label': 'Fat', 'tag': 'FAT', 'schemaOrgTag'...",Self,https://api.edamam.com/api/recipes/v2/b2cb2273...
7,http://www.edamam.com/ontologies/edamam.owl#re...,Yellow Rice,https://edamam-product-images.s3.amazonaws.com...,https://edamam-product-images.s3.amazonaws.com...,100,100,https://edamam-product-images.s3.amazonaws.com...,200,200,https://edamam-product-images.s3.amazonaws.com...,...,%,Vitamin E,16.111778,%,Vitamin K,0.574583,%,"[{'label': 'Fat', 'tag': 'FAT', 'schemaOrgTag'...",Self,https://api.edamam.com/api/recipes/v2/1aea4cf6...
8,http://www.edamam.com/ontologies/edamam.owl#re...,Basic Sushi Rice Recipe,https://edamam-product-images.s3.amazonaws.com...,https://edamam-product-images.s3.amazonaws.com...,100,100,https://edamam-product-images.s3.amazonaws.com...,200,200,https://edamam-product-images.s3.amazonaws.com...,...,%,Vitamin E,0.0,%,Vitamin K,0.0,%,"[{'label': 'Fat', 'tag': 'FAT', 'schemaOrgTag'...",Self,https://api.edamam.com/api/recipes/v2/45b6b9a9...
9,http://www.edamam.com/ontologies/edamam.owl#re...,Sweet Cinnamon Rice Pudding,https://edamam-product-images.s3.amazonaws.com...,https://edamam-product-images.s3.amazonaws.com...,100,100,https://edamam-product-images.s3.amazonaws.com...,200,200,https://edamam-product-images.s3.amazonaws.com...,...,%,Vitamin E,15.090267,%,Vitamin K,0.8845,%,"[{'label': 'Fat', 'tag': 'FAT', 'schemaOrgTag'...",Self,https://api.edamam.com/api/recipes/v2/0942f9ae...


So this would need a *lot* of work before you can extract a list of ingredients. It looks like it would be a web scraping situation.
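Before turning to scraping, it may be worth checking whether the response already carries ingredient data: the column listing above is truncated, so a quick `[c for c in edamam_df.columns if "ingredient" in c.lower()]` would settle it. The sketch below assumes each recipe has an `ingredients` list of objects with a `food` field. I believe Edamam's v2 responses look like this, but that structure is not visible in the output above, so treat it as an assumption; `ingredients_by_recipe` is a hypothetical helper:

```python
def ingredients_by_recipe(hits):
    """Map each recipe label to its list of ingredient names."""
    table = {}
    for hit in hits:
        recipe = hit.get("recipe", {})
        # Assumed structure: recipe["ingredients"] is a list of dicts with a "food" key.
        foods = [item.get("food") for item in recipe.get("ingredients", [])]
        table[recipe.get("label")] = [food for food in foods if food]
    return table
```

If the assumption holds, `ingredients_by_recipe(result["hits"])` would give the label-to-ingredients mapping the predictor needs.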

Although - maybe the recipe.digest column has something useful...

In [105]:
edamam_df['recipe.digest'].iloc[1]

[{'label': 'Fat',
  'tag': 'FAT',
  'schemaOrgTag': 'fatContent',
  'total': 47.0574704625,
  'hasRDI': True,
  'daily': 72.39610840384614,
  'unit': 'g',
  'sub': [{'label': 'Saturated',
    'tag': 'FASAT',
    'schemaOrgTag': 'saturatedFatContent',
    'total': 29.4449813295,
    'hasRDI': True,
    'daily': 147.2249066475,
    'unit': 'g'},
   {'label': 'Trans',
    'tag': 'FATRN',
    'schemaOrgTag': 'transFatContent',
    'total': 1.8619039999999996,
    'hasRDI': False,
    'daily': 0.0,
    'unit': 'g'},
   {'label': 'Monounsaturated',
    'tag': 'FAMS',
    'schemaOrgTag': None,
    'total': 12.166724185,
    'hasRDI': False,
    'daily': 0.0,
    'unit': 'g'},
   {'label': 'Polyunsaturated',
    'tag': 'FAPU',
    'schemaOrgTag': None,
    'total': 1.8616667586875,
    'hasRDI': False,
    'daily': 0.0,
    'unit': 'g'}]},
 {'label': 'Carbs',
  'tag': 'CHOCDF',
  'schemaOrgTag': 'carbohydrateContent',
  'total': 305.94770300625,
  'hasRDI': True,
  'daily': 101.98256766875,
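Judging by this output, the digest holds nutrient totals rather than ingredients. If those are ever useful as features, the nested structure can be flattened. A minimal sketch, assuming every entry has `label` and `total` keys and an optional `sub` list, as in the output shown (`flatten_digest` is a hypothetical helper):

```python
def flatten_digest(digest):
    """Return {label: total} for each nutrient, including nested 'sub' entries."""
    totals = {}
    for entry in digest:
        totals[entry["label"]] = entry["total"]
        # Recurse into sub-nutrients (e.g. Saturated under Fat) when present.
        totals.update(flatten_digest(entry.get("sub", [])))
    return totals
```

For one recipe this would be `flatten_digest(edamam_df['recipe.digest'].iloc[1])`.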
  

## API 3: TheMealDB

See API 2 above for an explanation of what I hope to get from this.

In [54]:
mealdb = requests.get("https://www.themealdb.com/api/json/v1/1/list.php?i=list")

In [64]:
mealdb = mealdb.json()

In [65]:
pd.json_normalize(mealdb['meals'])

Unnamed: 0,idIngredient,strIngredient,strDescription,strType
0,1,Chicken,"The chicken is a type of domesticated fowl, a ...",
1,2,Salmon,Salmon is the common name for several species ...,
2,3,Beef,Beef is the culinary name for meat from cattle...,
3,4,Pork,Pork is the culinary name for the flesh of a d...,
4,5,Avocado,"The avocado, a tree with probable origin in So...",
...,...,...,...,...
569,603,Cider,Cider (/ˈsaɪdər/ SY-dər) is an alcoholic bever...,Drink
570,604,Beetroot,The beetroot is the taproot portion of a beet ...,Vegetable
571,605,Sardines,"""Sardine"" and ""pilchard"" are common names that...",Seafood
572,606,Ciabatta,Ciabatta is an Italian white bread made from w...,Bread


That's pretty cool. But it's just a list of the ingredients that exist - I want to see if I can get recipes as well.

In [67]:
beetroot = requests.get("https://www.themealdb.com/api/json/v1/1/filter.php?i=beetroot")

In [72]:
beetroot = beetroot.json()

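AttributeError: 'dict' object has no attribute 'json'

(This error comes from running the cell a second time: after the first run, `beetroot` already held the parsed dict rather than the response object.)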

In [73]:
beetroot

{'meals': None}

No recipes with beetroot. Let's try rice again.
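As the beetroot query shows, an ingredient with no recipes comes back as `{'meals': None}` rather than an empty list, so looping over the result directly would fail. A small sketch of a guard for that case (`meal_names` is a hypothetical helper):

```python
def meal_names(payload):
    """Return meal names from a filter.php response, or [] when 'meals' is None."""
    return [meal["strMeal"] for meal in payload.get("meals") or []]
```

With this, `meal_names({'meals': None})` returns an empty list instead of raising a `TypeError`.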

In [74]:
rice = requests.get("https://www.themealdb.com/api/json/v1/1/filter.php?i=rice")

In [75]:
rice = rice.json()

In [76]:
rice

{'meals': [{'strMeal': 'Beef Banh Mi Bowls with Sriracha Mayo, Carrot & Pickled Cucumber',
   'strMealThumb': 'https://www.themealdb.com/images/media/meals/z0ageb1583189517.jpg',
   'idMeal': '52997'},
  {'strMeal': 'Chicken Congee',
   'strMealThumb': 'https://www.themealdb.com/images/media/meals/1529446352.jpg',
   'idMeal': '52956'},
  {'strMeal': 'Egyptian Fatteh',
   'strMealThumb': 'https://www.themealdb.com/images/media/meals/rlwcc51598734603.jpg',
   'idMeal': '53031'},
  {'strMeal': 'Gołąbki (cabbage roll)',
   'strMealThumb': 'https://www.themealdb.com/images/media/meals/q8sp3j1593349686.jpg',
   'idMeal': '53021'},
  {'strMeal': 'Kedgeree',
   'strMealThumb': 'https://www.themealdb.com/images/media/meals/utxqpt1511639216.jpg',
   'idMeal': '52887'},
  {'strMeal': 'Koshari',
   'strMealThumb': 'https://www.themealdb.com/images/media/meals/4er7mj1598733193.jpg',
   'idMeal': '53027'},
  {'strMeal': 'Nasi lemak',
   'strMealThumb': 'https://www.themealdb.com/images/media/meals/

That's not a whole lot of output. I wonder how large their database of recipes really is.

I looked at their website: only 285 meals. So while this is awesome, it's not going to be enough to build a sensible predictor/recommender for ingredients.
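For a small-scale trial, the ingredient list of a single meal could still be pulled out. The sketch below assumes that a per-meal lookup (something like `lookup.php?i=<idMeal>`) returns a record with numbered fields `strIngredient1` through `strIngredient20`, with unused slots empty or null. That is how the API appears to be documented, but it is not shown anywhere above, so verify it first; `meal_ingredients` is a hypothetical helper:

```python
def meal_ingredients(meal, max_slots=20):
    """Collect the non-empty strIngredientN values from one meal record."""
    names = []
    for slot in range(1, max_slots + 1):
        value = meal.get(f"strIngredient{slot}")  # assumed field naming
        if value and value.strip():
            names.append(value.strip())
    return names
```

Run over the `idMeal` values from the rice query, this would yield one ingredient list per meal.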