# Looted Benin Artwork Distribution

Scrape <a href="https://digitalbenin.org/">the Digital Benin site</a> to build a dataframe containing the following information about each institution:

- Museum name
- Country
- Number of disputed items

Export the result as `disputed-benin-artwork.csv`.

In [1]:
## Import libraries
import pandas as pd
import requests  # fetches the raw HTML from the server
from bs4 import BeautifulSoup

In [2]:
## Request the web content
url = "https://digitalbenin.org/institutions"
## Fetch the institutions page
response = requests.get(url)

In [3]:
## did it work?
response.status_code

200
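
Checking `status_code` by eye works in a notebook, but the check can also be made to fail loudly. The sketch below shows one way to do that, assuming a hypothetical helper named `ensure_ok`. It relies on `Response.raise_for_status()`, which raises `requests.HTTPError` for 4xx and 5xx responses. To stay runnable offline, the example builds bare `Response` objects by hand instead of contacting the site.

```python
import requests


def ensure_ok(response: requests.Response) -> requests.Response:
    """Return the response unchanged, or raise if the server reported an error.

    `ensure_ok` is a hypothetical helper for illustration, not part of requests.
    """
    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
    return response


# Offline demonstration with hand-built responses (no network access needed).
ok = requests.Response()
ok.status_code = 200

missing = requests.Response()
missing.status_code = 404

# A 200 response passes straight through.
passed = ensure_ok(ok) is ok

# A 404 response raises instead of being silently parsed later on.
try:
    ensure_ok(missing)
    failed = False
except requests.HTTPError:
    failed = True

print(passed, failed)
```

In the notebook this could be used as `response = ensure_ok(requests.get(url, timeout=30))`, where the 30-second timeout is an arbitrary choice.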

In [4]:
## Create the soup
soup = BeautifulSoup(response.text, 'html.parser')  # parses the response text into an HTML tree that can be scraped

In [5]:
## Pretty-print the parsed HTML (uncomment to inspect it)
# print(soup.prettify())

In [6]:
## Find every institution link (carries both the name and the URL)
institutions = soup.find_all("a", class_="fs-5")

In [7]:
len(institutions)

131
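
To see what `find_all` matches here, the following sketch runs the same call on a small inline snippet. The markup is invented for illustration and is only an assumption about how the page might be structured; the real Digital Benin HTML may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two links carrying the "fs-5" class and one without it.
sample_html = """
<div>
  <a class="fs-5 fw-bold" href="/institutions/5">British Museum</a>
  <a class="fs-5" href="/institutions/15">Field Museum</a>
  <a class="nav-link" href="/about">About</a>
</div>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

# class_ matches any tag whose class list contains "fs-5",
# even when other classes are present on the same tag.
links = sample_soup.find_all("a", class_="fs-5")

names = [link.get_text(strip=True) for link in links]
hrefs = [link.get("href") for link in links]

print(names)
print(hrefs)
```

Because the match is on a generic styling class, it is worth confirming the count (131 here) against what the page displays.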

In [8]:
## Extract the institution names
institution_name = [institution.get_text() for institution in institutions]

In [9]:
## Build the absolute URL of each institution page
institutions_url = ["https://digitalbenin.org" + institution.get('href') for institution in institutions]
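
String concatenation works as long as every `href` is a root-relative path. As an alternative sketch, `urllib.parse.urljoin` from the standard library also copes with links that are already absolute. The `hrefs` list below is made-up sample input, not scraped data.

```python
from urllib.parse import urljoin

base_url = "https://digitalbenin.org"

# Sample hrefs (assumed for illustration): one relative, one already absolute.
hrefs = ["/institutions/5", "https://digitalbenin.org/institutions/13"]

# urljoin resolves relative paths against the base and leaves absolute URLs alone.
absolute_urls = [urljoin(base_url, href) for href in hrefs]
print(absolute_urls)
```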

In [10]:
## Find every country badge
countries = soup.find_all("span", class_="badge")

In [11]:
## Extract the country names
countries_lc = [country.get_text() for country in countries]

In [12]:
## Find every item-count element
items = soup.find_all("div", class_="object_count")

In [13]:
## Extract the item counts (still strings at this point)
items_lc = [item.get_text() for item in items]

In [14]:
## Zip the four lists into one list of tuples, one tuple per institution
disputed_items = list(zip(institution_name, countries_lc, items_lc, institutions_url))
disputed_items

[('British Museum',
  'United Kingdom',
  '944',
  'https://digitalbenin.org/institutions/5'),
 ('Ethnologisches Museum, Staatliche Museen zu Berlin',
  'Germany',
  '518',
  'https://digitalbenin.org/institutions/13'),
 ('Field Museum',
  'United States',
  '393',
  'https://digitalbenin.org/institutions/15'),
 ('Museum of Archaeology and Anthropology, University of Cambridge',
  'United Kingdom',
  '350',
  'https://digitalbenin.org/institutions/28'),
 ('National Museum, Benin',
  'Nigeria',
  '285',
  'https://digitalbenin.org/institutions/36'),
 ('Staatliche Ethnographische Sammlungen Sachsen und Staatliche Kunstsammlungen Dresden',
  'Germany',
  '283',
  'https://digitalbenin.org/institutions/47'),
 ('Weltmuseum Wien',
  'Austria',
  '202',
  'https://digitalbenin.org/institutions/50'),
 ('University of Pennsylvania Museum of Archaeology and Anthropology (Penn Museum)',
  'United States',
  '188',
  'https://digitalbenin.org/institutions/49'),
 ('MARKK Museum am Rothenbaum Kultur

In [15]:
len(disputed_items)

131
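
`zip` stops at the shortest input, so if one selector had matched fewer elements than the others, rows would be dropped or misaligned without any error. Below is a minimal sketch of a guard, assuming a hypothetical helper `zip_aligned` and made-up sample lists.

```python
def zip_aligned(*columns):
    """Zip equally long lists into rows; raise if the lengths differ.

    Hypothetical helper for illustration. On Python 3.10+,
    zip(*columns, strict=True) gives a similar guarantee.
    """
    lengths = {len(column) for column in columns}
    if len(lengths) > 1:
        raise ValueError(f"column lengths differ: {sorted(lengths)}")
    return list(zip(*columns))


names = ["British Museum", "Field Museum"]
countries = ["United Kingdom", "United States"]
counts = ["944", "393"]

rows = zip_aligned(names, countries, counts)
print(rows)

# Dropping one count makes the lists uneven, which now raises.
try:
    zip_aligned(names, countries, counts[:1])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

Equal lengths do not prove that each name sits next to its own country and count, only that nothing was silently cut off.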

In [16]:
## Build the dataframe and name its columns in one step
df = pd.DataFrame(disputed_items, columns=["Institution", "Country", "Disputed items", "More Info"])
df

|     | Institution | Country | Disputed items | More Info |
| --- | --- | --- | --- | --- |
| 0 | British Museum | United Kingdom | 944 | https://digitalbenin.org/institutions/5 |
| 1 | Ethnologisches Museum, Staatliche Museen zu Be... | Germany | 518 | https://digitalbenin.org/institutions/13 |
| 2 | Field Museum | United States | 393 | https://digitalbenin.org/institutions/15 |
| 3 | Museum of Archaeology and Anthropology, Univer... | United Kingdom | 350 | https://digitalbenin.org/institutions/28 |
| 4 | National Museum, Benin | Nigeria | 285 | https://digitalbenin.org/institutions/36 |
| ... | ... | ... | ... | ... |
| 126 | Speed Art Museum | United States | 1 | https://digitalbenin.org/institutions/345 |
| 127 | Niedersächsische Landesmuseen Oldenburg, Lande... | Germany | 1 | https://digitalbenin.org/institutions/346 |
| 128 | Hull Museums | United Kingdom | 1 | https://digitalbenin.org/institutions/69 |
| 129 | Great North Museum: Hancock | United Kingdom | 1 | https://digitalbenin.org/institutions/84 |
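
The scraped counts are strings (note the quotes around `'944'` in the list output earlier), so they need converting before any arithmetic. The sketch below uses only the eight rows that appear in full in that output, not the complete dataset, and sums the items per country. The variable names are illustrative.

```python
import pandas as pd

# The eight rows shown in full in the list output (URLs omitted).
sample = pd.DataFrame(
    [
        ("British Museum", "United Kingdom", "944"),
        ("Ethnologisches Museum, Staatliche Museen zu Berlin", "Germany", "518"),
        ("Field Museum", "United States", "393"),
        ("Museum of Archaeology and Anthropology, University of Cambridge", "United Kingdom", "350"),
        ("National Museum, Benin", "Nigeria", "285"),
        ("Staatliche Ethnographische Sammlungen Sachsen und Staatliche Kunstsammlungen Dresden", "Germany", "283"),
        ("Weltmuseum Wien", "Austria", "202"),
        ("University of Pennsylvania Museum of Archaeology and Anthropology (Penn Museum)", "United States", "188"),
    ],
    columns=["Institution", "Country", "Disputed items"],
)

# Convert the text counts to integers so they can be summed and sorted.
sample["Disputed items"] = pd.to_numeric(sample["Disputed items"])

# Total items per country, largest first.
by_country = (
    sample.groupby("Country")["Disputed items"]
    .sum()
    .sort_values(ascending=False)
)
print(by_country)
```

Applied to the full `df`, the same two steps would give the per-country distribution that the notebook title refers to.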


In [17]:
## Write the dataframe to a CSV file (index=False leaves out the row numbers)
df.to_csv("disputed-benin-artwork.csv", index=False, encoding="utf-8")
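
A quick way to confirm the export is to read the file back. This sketch writes a two-row sample frame to a temporary directory, so it does not touch the real CSV, and reloads it. The sample rows are taken from the output shown earlier.

```python
import tempfile
from pathlib import Path

import pandas as pd

sample = pd.DataFrame(
    {
        "Institution": ["British Museum", "Weltmuseum Wien"],
        "Country": ["United Kingdom", "Austria"],
        "Disputed items": [944, 202],
    }
)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "disputed-benin-artwork.csv"
    # index=False keeps the row numbers out of the file.
    sample.to_csv(csv_path, index=False, encoding="utf-8")
    # Read the file back while the temporary directory still exists.
    reloaded = pd.read_csv(csv_path, encoding="utf-8")

print(reloaded)
```

If the reloaded frame has the expected columns and row count, the export kept the structure intact.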