# Working with the Seattle Public Library "Inventory" Dataset in Jupyter Notebook

In this part of the assignment, we use the "Inventory" dataset to examine the holdings of the Central Library:

- How large was the stock of nonfiction books at the Central Library of the Seattle Public Library in 2018?

One restriction is that the number of rows for this task is limited to 3,000,000.
In addition, the collection type is restricted to "CA-Nonfiction".

## Question 5: How large was the stock of nonfiction books at the Central Library of the Seattle Public Library in 2018?

In [2]:
# First, we import the required libraries
import urllib.request

In [3]:
import pandas as pd

In [29]:
import numpy as np

In [4]:
# Before the download, we define variables for the URL and the CSV file. We set the limit to 3,000,000 here as well.
url="https://data.seattle.gov/resource/6vkj-f5xf.csv?$limit=3000000"
inv="inventory2018.csv"

In [5]:
# Now we can download the file
urllib.request.urlretrieve(url, inv)

('inventory2018.csv', <http.client.HTTPMessage at 0x22ae2a11e88>)
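
The download URL above is a handwritten string with the `$limit` parameter appended. As a minimal sketch, the same URL could be assembled with `urllib.parse.urlencode`, which makes the row limit easier to change. The helper name `build_url` is hypothetical and not part of the original notebook.

```python
from urllib.parse import urlencode

# Base endpoint taken from the notebook cell above
BASE_URL = "https://data.seattle.gov/resource/6vkj-f5xf.csv"

def build_url(base, limit):
    """Hypothetical helper: append the $limit query parameter to the endpoint."""
    # safe="$" keeps the dollar sign unescaped, so the result matches the handwritten URL
    query = urlencode({"$limit": limit}, safe="$")
    return f"{base}?{query}"

url = build_url(BASE_URL, 3000000)
print(url)
```

The resulting string can be passed to `urllib.request.urlretrieve` exactly as in the cell above.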

In [31]:
# Next, we read the file into a variable and display it
invent=pd.read_csv(inv)
invent

Unnamed: 0,bibnum,title,author,isbn,publicationyear,publisher,subjects,itemtype,itemcollection,floatingitem,itemlocation,reportdate,itemcount
0,3011076,A tale of two friends / adapted by Ellie O'Rya...,"O'Ryan, Ellie","1481425730, 1481425749, 9781481425735, 9781481...",2014.,"Simon Spotlight,","Musicians Fiction, Bullfighters Fiction, Best ...",jcbk,ncrdr,Floating,qna,2017-09-01T00:00:00.000,1
1,2248846,"Naruto. Vol. 1, Uzumaki Naruto / story and art...","Kishimoto, Masashi, 1974-",1569319006,"2003, c1999.","Viz,","Ninja Japan Comic books strips etc, Comic book...",acbk,nycomic,,lcy,2017-09-01T00:00:00.000,1
2,3209270,"Peace, love & Wi-Fi : a ZITS treasury / by Jer...","Scott, Jerry, 1955-","144945867X, 9781449458676",2014.,"Andrews McMeel Publishing,",Duncan Jeremy Fictitious character Comic books...,acbk,nycomic,,bea,2017-09-01T00:00:00.000,1
3,1907265,The Paris pilgrims : a novel / Clancy Carlile.,"Carlile, Clancy, 1930-",0786706155,c1999.,"Carroll & Graf,","Hemingway Ernest 1899 1961 Fiction, Biographic...",acbk,cafic,,cen,2017-09-01T00:00:00.000,1
4,1644616,"Erotic by nature : a celebration of life, of l...",,094020813X,"1991, c1988.","Red Alder Books/Down There Press,","Erotic literature American, American literatur...",acbk,canf,,cen,2017-09-01T00:00:00.000,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...
2999995,2831460,The Frank show / David Mackintosh.,"Mackintosh, David","1419703935, 9781419703935",2012.,"Abrams Books for Young Readers,","Grandfathers Juvenile fiction, Show and tell p...",jcbk,ncpic,,glk,2017-11-01T00:00:00.000,1
2999996,2983333,Eating dangerously : why the government can't ...,"Booth, Michael, 1965-","1442222662, 9781442222663",[2014],"Rowman & Littlefield,","Food poisoning United States Prevention, Food ...",acbk,canf,,cen,2017-11-01T00:00:00.000,2
2999997,3141286,How to make your money last : the indispensabl...,"Quinn, Jane Bryant","1476743762, 1476743770, 9781476743769, 9781476...",2016.,"Simon & Schuster,","Retirement income Planning, Retirees Finance P...",acbk,nanf,,cap,2017-11-01T00:00:00.000,1
2999998,2923232,"Sock art : bold, graphic knits for your feet /...","Janssen, Edelgard, 1938-","157076557X, 9781570765575",c2013.,"Trafalgar Square,","Knitting Patterns, Socks",acbk,canf,,cen,2017-11-01T00:00:00.000,2


In [7]:
# We display the column names
invent.columns

Index(['bibnum', 'title', 'author', 'isbn', 'publicationyear', 'publisher',
       'subjects', 'itemtype', 'itemcollection', 'floatingitem',
       'itemlocation', 'reportdate', 'itemcount'],
      dtype='object')
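
Only a few of these columns matter for the question, so one option is to parse just those columns when reading the file. The following is a sketch under assumptions: it uses a tiny made-up CSV in place of `inventory2018.csv`, and the variable names `sample_csv`, `needed` and `small` are illustrative.

```python
import io
import pandas as pd

# Tiny stand-in for inventory2018.csv (made-up rows, column names as in the notebook)
sample_csv = io.StringIO(
    "bibnum,title,itemcollection,itemlocation,reportdate,itemcount\n"
    "1,Title A,canf,cen,2017-09-01T00:00:00.000,1\n"
    "2,Title B,cafic,cen,2017-09-01T00:00:00.000,2\n"
    "3,Title C,nanf,cap,2017-11-01T00:00:00.000,1\n"
)

# usecols restricts parsing to the listed columns, which saves memory on large files
needed = ["itemcollection", "itemlocation", "reportdate", "itemcount"]
small = pd.read_csv(sample_csv, usecols=needed)
print(small.columns.tolist())
```

With the real file, `pd.read_csv(inv, usecols=needed)` would follow the same pattern.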

In [32]:
# For our question, the columns "itemcollection" and "itemlocation" are relevant. We restrict the display to these columns.
invent[["itemcollection","itemlocation"]]

Unnamed: 0,itemcollection,itemlocation
0,ncrdr,qna
1,nycomic,lcy
2,nycomic,bea
3,cafic,cen
4,canf,cen
...,...,...
2999995,ncpic,glk
2999996,canf,cen
2999997,nanf,cap
2999998,canf,cen


In [33]:
# We are interested in the holdings of the Central Library ("cen").
# We therefore build a Boolean mask that marks the rows with this item location and display it first.
invlist=invent.itemlocation.str.contains(pat="cen")
invlist

0          False
1          False
2          False
3           True
4           True
           ...  
2999995    False
2999996     True
2999997    False
2999998     True
2999999     True
Name: itemlocation, Length: 3000000, dtype: bool
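
Note that `str.contains` performs a substring match, so it would also mark any code that merely contains "cen". Whether such codes occur in this dataset is not checked here. The sketch below uses made-up values (the code "cenx" is purely hypothetical) to show how an exact comparison with `==` differs.

```python
import pandas as pd

# Made-up location codes; "cenx" is a hypothetical code used only to show the difference
locations = pd.Series(["cen", "cap", "cenx", None])

# Substring match: True for every value that contains "cen"; na=False maps missing values to False
substring_mask = locations.str.contains("cen", na=False)

# Exact match: True only where the value equals "cen"
exact_mask = locations == "cen"

print(substring_mask.tolist())
print(exact_mask.tolist())
```

If every location code containing "cen" is in fact exactly "cen", both masks select the same rows.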

In [34]:
# We now use the mask to select the matching rows and display them as a table
invcen=invent[invlist]
invcen

Unnamed: 0,bibnum,title,author,isbn,publicationyear,publisher,subjects,itemtype,itemcollection,floatingitem,itemlocation,reportdate,itemcount
3,1907265,The Paris pilgrims : a novel / Clancy Carlile.,"Carlile, Clancy, 1930-",0786706155,c1999.,"Carroll & Graf,","Hemingway Ernest 1899 1961 Fiction, Biographic...",acbk,cafic,,cen,2017-09-01T00:00:00.000,1
4,1644616,"Erotic by nature : a celebration of life, of l...",,094020813X,"1991, c1988.","Red Alder Books/Down There Press,","Erotic literature American, American literatur...",acbk,canf,,cen,2017-09-01T00:00:00.000,1
5,1736505,Children of Cambodia's killing fields : memoir...,,"0300068395, 0300078730",c1997.,"Yale University Press,","Political atrocities Cambodia, Children Cambod...",acbk,canf,,cen,2017-09-01T00:00:00.000,1
6,1749492,Anti-Zionism : analytical reflections / editor...,,091559773X,c1989.,"Amana Books,","Berger Elmer 1908 1996, Zionism Controversial ...",acbk,canf,,cen,2017-09-01T00:00:00.000,1
12,2519097,Stop bullying now! [videorecording] : Take a s...,,,[2006?],U.S. Dept. of Health and Human Services ; Dept...,"Bullying in schools United States Prevention, ...",acdvd,cadvdnf,,cen,2017-09-01T00:00:00.000,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...
2999989,3286348,Who's that girl / Blair Thornburgh.,"Thornburgh, Blair","0062447777, 9780062447777",[2017],"HarperTeen, an imprint of HarperCollinsPublish...","High school students Fiction, Rock musicians F...",acbk,cyfic,,cen,2017-11-01T00:00:00.000,1
2999990,2918511,All alone / Kevin Henkes.,"Henkes, Kevin","0060541156, 0060541164, 9780060541156, 9780060...",c2003.,"Greenwillow Books/HarperCollins Publishers,",Solitude Juvenile fiction,jcbk,ccpic,,cen,2017-11-01T00:00:00.000,2
2999996,2983333,Eating dangerously : why the government can't ...,"Booth, Michael, 1965-","1442222662, 9781442222663",[2014],"Rowman & Littlefield,","Food poisoning United States Prevention, Food ...",acbk,canf,,cen,2017-11-01T00:00:00.000,2
2999998,2923232,"Sock art : bold, graphic knits for your feet /...","Janssen, Edelgard, 1938-","157076557X, 9781570765575",c2013.,"Trafalgar Square,","Knitting Patterns, Socks",acbk,canf,,cen,2017-11-01T00:00:00.000,2


In [35]:
# We again reduce this table to the relevant columns.
invcen[["itemcollection","itemlocation"]]

Unnamed: 0,itemcollection,itemlocation
3,cafic,cen
4,canf,cen
5,canf,cen
6,canf,cen
12,cadvdnf,cen
...,...,...
2999989,cyfic,cen
2999990,ccpic,cen
2999996,canf,cen
2999998,canf,cen


In [36]:
# We assign this table to a variable so that we can use it as the basis for further steps.
invallcen=invcen[["itemcollection","itemlocation"]]
invallcen

Unnamed: 0,itemcollection,itemlocation
3,cafic,cen
4,canf,cen
5,canf,cen
6,canf,cen
12,cadvdnf,cen
...,...,...
2999989,cyfic,cen
2999990,ccpic,cen
2999996,canf,cen
2999998,canf,cen
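
Before filtering for a single collection type, it can help to see which collection codes occur at the Central Library and how often. A minimal sketch with `value_counts`, using only the five collection codes visible in the first rows of the table above as sample data (the name `central` is illustrative):

```python
import pandas as pd

# Sample data: the collection codes from the first five rows shown above
central = pd.DataFrame({
    "itemcollection": ["cafic", "canf", "canf", "canf", "cadvdnf"],
    "itemlocation": ["cen"] * 5,
})

# value_counts tallies how many rows carry each collection code
counts = central["itemcollection"].value_counts()
print(counts)
```

On the full table, `invallcen.itemcollection.value_counts()` would give the same kind of overview.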


In [50]:
# In the "itemcollection" column, we now want to filter for the type CA-Nonfiction ("canf").
nonfall=invallcen.itemcollection.str.contains(pat="canf")
nonfall

3          False
4           True
5           True
6           True
12         False
           ...  
2999989    False
2999990    False
2999996     True
2999998     True
2999999    False
Name: itemcollection, Length: 1206453, dtype: bool
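
The two filter steps (location "cen", then collection "canf") can also be combined into one Boolean mask. The following sketch assumes exact matches are intended and uses a small made-up table; the names `inventory`, `is_central`, `is_canf` and `central_nonfiction` are illustrative.

```python
import pandas as pd

# Made-up rows with the two columns used in the notebook
inventory = pd.DataFrame({
    "itemcollection": ["ncrdr", "cafic", "canf", "canf", "nanf"],
    "itemlocation": ["qna", "cen", "cen", "cen", "cap"],
})

is_central = inventory["itemlocation"] == "cen"   # Central Library rows
is_canf = inventory["itemcollection"] == "canf"   # CA-Nonfiction rows

# & combines the masks element-wise; .loc selects rows and columns in one step
central_nonfiction = inventory.loc[is_central & is_canf, ["itemcollection", "itemlocation"]]
print(len(central_nonfiction))
```

Because both masks share the index of the original table, no intermediate DataFrame is needed.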

In [59]:
# We apply this mask to the table and assign the result to a variable
invcanf=invcen[nonfall]
invcanf

Unnamed: 0,bibnum,title,author,isbn,publicationyear,publisher,subjects,itemtype,itemcollection,floatingitem,itemlocation,reportdate,itemcount
4,1644616,"Erotic by nature : a celebration of life, of l...",,094020813X,"1991, c1988.","Red Alder Books/Down There Press,","Erotic literature American, American literatur...",acbk,canf,,cen,2017-09-01T00:00:00.000,1
5,1736505,Children of Cambodia's killing fields : memoir...,,"0300068395, 0300078730",c1997.,"Yale University Press,","Political atrocities Cambodia, Children Cambod...",acbk,canf,,cen,2017-09-01T00:00:00.000,1
6,1749492,Anti-Zionism : analytical reflections / editor...,,091559773X,c1989.,"Amana Books,","Berger Elmer 1908 1996, Zionism Controversial ...",acbk,canf,,cen,2017-09-01T00:00:00.000,1
27,2352640,Morningstar guide to mutual funds : five-star ...,,"0471718327, 9780471718321",c2005.,"John Wiley & Sons ,",Mutual funds,acbk,canf,,cen,2017-09-01T00:00:00.000,1
38,11879,Mirror to the American past; a survey of Ameri...,"Williams, Hermann Warner, 1908-1974",0821204440,[1973],New York Graphic Society,"Genre painting American, Genre painting 18th c...",acbk,canf,,cen,2017-09-01T00:00:00.000,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...
2999978,81740,Contexts of the drama [compiled by] Richard Go...,"Goldstone, Richard Henry",,[1968],McGraw-Hill,Drama,acbk,canf,,cen,2017-11-01T00:00:00.000,1
2999980,1845598,"Aviation, transport services : agreement betwe...",Austria.,,[1998?],Dept. of State : For sale by the Supt. of Docs...,"Airlines Management International cooperation,...",acbk,canf,,cen,2017-11-01T00:00:00.000,1
2999983,196512,"Selections from Carlyle: Sartor Resartus, The ...","Carlyle, Thomas, 1795-1881",,[c1915],D.C. Heath & co.,,acbk,canf,,cen,2017-11-01T00:00:00.000,1
2999996,2983333,Eating dangerously : why the government can't ...,"Booth, Michael, 1965-","1442222662, 9781442222663",[2014],"Rowman & Littlefield,","Food poisoning United States Prevention, Food ...",acbk,canf,,cen,2017-11-01T00:00:00.000,2


In [58]:
# Now we reduce the table to the columns relevant to us.
# A list is used as the column selector; indexing with a set is not supported by newer pandas versions.
canfcen=invcanf[["itemlocation","itemcollection"]]
canfcen

Unnamed: 0,itemlocation,itemcollection
4,cen,canf
5,cen,canf
6,cen,canf
27,cen,canf
38,cen,canf
...,...,...
2999978,cen,canf
2999980,cen,canf
2999983,cen,canf
2999996,cen,canf


Solution 5: Our result table contains 428924 rows. Accordingly, subject to the restrictions we set, the holdings of the Central Library of the Seattle Public Library include __428924__ titles of the collection type CA-Nonfiction.
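
The rows shown above carry different `reportdate` values and an `itemcount` column, so the row count, the number of rows per report date, and the sum of `itemcount` can differ. As a hedged sketch on made-up data (not the notebook's result), the three figures could be compared like this:

```python
import pandas as pd

# Made-up result table: two titles, each reported on two dates
result = pd.DataFrame({
    "bibnum": [1, 2, 1, 2],
    "reportdate": [
        "2017-09-01T00:00:00.000", "2017-09-01T00:00:00.000",
        "2017-11-01T00:00:00.000", "2017-11-01T00:00:00.000",
    ],
    "itemcount": [1, 2, 1, 2],
})

# Total number of rows, as used for the solution above
row_count = len(result)

# Per report date: number of rows and number of physical items
per_report = result.groupby("reportdate")["itemcount"].agg(rows="count", items="sum")
print(row_count)
print(per_report)
```

Applied to `invcanf`, the same grouping would show how the 428924 rows are distributed across report dates.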