#NOME FIGO
Art and history of art are no sealed compartments: they are heavily inter-dependent with social, political, economic factors, which in turn influence our very perception of what art is.

Cultural institutions and museums in particular play a fundamental role in this intertwined dynamics: through their selection activity, they have the potential to shape the public understanding of arts and its modifications throughout time. 
In some way, what makes into museums makes into history of art.

From these considerations stems our analysis: how do external (social, political, economic) factors influence the perception of art and its history?
A way to investigate it is by looking at the greatest and most representative museums around the world, and at their acquisition policies and campaigns in particular.

Our key questions:
In which ways have the acquisition campaigns of the major museums in the world changed throughout the years? 


Our workflow:
1. Interrogate WikiData:
    - What are the biggest collections around the world?
2. Find csv files for some of the major museums.
3. Select some representative time slots (both internal and external factors).
4. Analyse acquisitions during these time slots for every museum and compare:
    a) Difference between different slots in the same museum;
    b) Difference between different museums for the same time slot;

Our questions:
- What was the initial nucleus of each museum? 
- Internal survey: Is there a significant date or decade for the acquisitions? 
- External survey: What are the acquisition trends around the Xs/between the x and the y? / What are the acquisition trends within and across these museums? 
- During these years, who are the most represented makers? What is the most represented gender? What is the most represented movement? What is the most represented nationality? 


We analysed 5/4 of the (MET, MoMa, N+, Cleveland?, Tate) 

Wikidata interrogation: failure.

1. What are the largest art collections?

SELECT ?museum (COUNT(?work) AS ?works) WHERE {
  ?work wdt:P195 ?museum.
  ?museum wdt:P31 wd:Q207694
  }
GROUP BY ?museum 
ORDER BY DESC(?works)

2.  Which were the most visited museums in 2018?

SELECT ?museumLabel ?visitors ?year
WHERE {
  ?museum wdt:P31 wd:Q207694;
          wdt:P1705 ?museumLabel;
          wdt:P1174 ?visitors;
          p:P1174/pq:P585 ?year .
FILTER(YEAR(?year) = 2018).
}

ORDER BY DESC(?visitors)

Since WikiData was not providing reliable results, we decided to go back to its sources (The Art Newspaper https://www.theartnewspaper.com/) and manually collect data about the most visited museums in the last four years(2018-2022).

https://onedrive.live.com/view.aspx?resid=E34DDE1A3F2F2160!138&ithint=file%2cxlsx&authkey=!AN4u-K4bko37iOU
    
We verified the availability of open datasets for each of the top 20 most visited museums on this GitHub repository (https://github.com/Ambrosiani/museums-on-github), containing a list of museums with GitHub accounts.

Our analysis led us to the decision to focus on four museums:
- Tate Modern, London
- MoMa, NY
- Met, NY
- National Gallery of Art, Washington DC

**Info generale sui musei.

In [3]:
import pandas as pd
import csv
import re

First: let us create some pandas dataframes containing all needed information: for each Museum, we will integrate different csv files, selecting the data we need for each of them. 

#MoMa 

In [53]:
spreadsheet = pd.read_csv('https://media.githubusercontent.com/media/MuseumofModernArt/collection/master/Artworks.csv')
pd.set_option('display.max_columns', None)
artworks = spreadsheet[['Title', 'Artist', 'ConstituentID', 'Nationality', 'BeginDate', 'EndDate', 'Gender', 'Date', 'Medium', 'CreditLine', 'Classification', 'Department', 'DateAcquired', 'URL']]
artists = pd.read_csv('https://media.githubusercontent.com/media/MuseumofModernArt/collection/master/Artists.csv')
artists["ConstituentID"] = artists["ConstituentID"].astype(str)
MoMa = pd.merge(artworks,artists[['ConstituentID', 'Wiki QID']],on='ConstituentID', how='left')
MoMa.rename(columns = {'ConstituentID':'Id', 'BeginDate':'BirthDate', 'EndDate':'DeathDate'}, inplace = True)
MoMa.Date = MoMa.Date.fillna('Not available')
MoMa['Date'] = MoMa['Date'].astype(str)
#MoMa.to_csv("MoMa.csv")

#Tate

In [4]:
spreadsheet = pd.read_csv('https://raw.githubusercontent.com/tategallery/collection/master/artwork_data.csv')
pd.set_option('display.max_columns', None)
artworks = spreadsheet[['artist', 'artistId', 'title', 'medium', 'creditLine', 'year', 'acquisitionYear', 'url']]
artworks.rename(columns = {'artistId':'id'}, inplace = True)
artworks.id = artworks.id.astype(str)
artists = pd.read_csv('https://raw.githubusercontent.com/tategallery/collection/master/artist_data.csv')
artists["id"] = artists["id"].astype(str)
Tate = pd.merge(artworks,artists[['id', 'gender', 'yearOfBirth', 'yearOfDeath']], on='id', how='left')
Tate.rename(columns = {'artist':'Artist', 'id':'Id', 'title':'Title', 'yearOfBirth':'BirthDate', 'yearOfDeath':'DeathDate', 'medium':'Medium', 'creditLine':'CreditLine', 'year':'Date', 'acquisitionYear':'DateAcquired', 'url':'URL', 'gender':'Gender'}, inplace = True)
Tate.to_csv("Tate.csv")

  has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return super().rename(
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self[name] = value


#Met

In [4]:
spreadsheet = pd.read_csv('https://media.githubusercontent.com/media/metmuseum/openaccess/master/MetObjects.csv')
pd.set_option('display.max_columns', None)
Met = spreadsheet[['AccessionYear', 'Title', 'Culture', 'Artist Display Name', 'Artist Nationality', 'Artist Begin Date', 'Artist End Date', 'Artist Gender', 'Artist Wikidata URL', 'Object End Date', 'Medium', 'Credit Line', 'Classification', 'Link Resource', 'Object Wikidata URL']]
Met.rename(columns = {'Artist Display Name':'Artist', 'id':'Id', 'Artist Begin Date':'BirthDate', 'Artist End Date':'DeathDate', 'Credit Line':'CreditLine', 'Object End Date':'Date', 'AccessionYear':'DateAcquired', 'Artist Wikidata URL':'Wiki QID', 'Artist Gender':'Gender', 'Link Resource':'URL', 'Artist Nationality':'Nationality'}, inplace = True)
#Met.to_csv("Met.csv")

  has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return super().rename(


#Nga

In [14]:
spreadsheet = pd.read_csv('https://raw.githubusercontent.com/NationalGalleryOfArt/opendata/main/data/objects.csv')
pd.set_option('display.max_columns', None)
Nga = spreadsheet[['accessionnum', 'title', 'endyear', 'medium', 'attribution', 'creditline', 'classification']]
Nga.rename(columns = {'attribution':'Artist', 'id':'Id', 'title':'Title', 'medium':'Medium', 'creditline':'CreditLine', 'endyear':'Date', 'accessionnum':'DateAcquired', 'classification':'Classification', 'Object End Date':'Date'}, inplace = True)
Nga.to_csv("Nga.csv")

  has_raised = await self.run_ast_nodes(code_ast.body, cell_name,


#Exploring our Museums
<br>
Now that we have our dataframes, we can explore the four collections.
<br>
- How many items does each collection contain?
- Which timespan do items cover overall?
- First and last acquisition date for each museum. Tate's csv last update dates back to 2014.
- Total artists' number.
- Most represented artist, gender and nationality in general?

In [15]:
museums=[MoMa, Met, Tate, Nga]
names = ['Moma', 'Met', 'Tate', 'Nga']
for museum in museums:
    selected_rows = museum[~museum['Title'].isnull()]
    name = names.pop(0)
    print("Total items at", name, ":", len(selected_rows.index))

Total items at Moma : 139912
Total items at Met : 448619
Total items at Tate : 69201
Total items at Nga : 137923


#ARTWORKS DATES

#Clean MoMa's artworks' creation dates 

In [57]:
def cleanDatesMoma(date):
    if '-' in date:
        splitted = date.split('-')
        date = ' '.join(splitted) 
    if '/' in date:
        splitted = date.split('/')
        date = ' '.join(splitted) 
    if ',' in date:
        splitted = date.split(',')
        date = ' '.join(splitted) 
    if '.' in date:
        splitted = date.split('.')
        date = ' '.join(splitted) 
        
    x = re.search("\d{4}", date)
    if x:
        date = x.group()
    return date

In [58]:
MoMaNew = MoMa.copy(deep=True)
MoMaNew["Date"] = MoMaNew["Date"].apply(cleanDatesMoma)
MoMaNew.to_csv("MoMaNew.csv")
#MoMaNew.head(30)

Artworks' timespan

In [86]:
museums=['MoMaNew.csv','Nga.csv', 'Tate.csv', 'metclean2.csv']
names = ['MoMa','Nga', 'Tate', 'Met']
for museum in museums:
    with open(museum, mode='r', encoding='utf-8') as csvfile:
        reader = csv.DictReader(csvfile)
        years=[]
        for item in reader:
            if item['Date'] != ''and item['Date'] != '(n d )'and item['Date'] != 'TBD'and item['Date'] != 'nd'and item['Date']!='c  196?' and 'c.' not in item['Date'] and item['Date'] != 'no date' and item['Date'] != 'date of publicati' and item['Date'] != 'New York' and item['Date'] != 'Not available' and item['Date'] != 'Various' and item['Date'] != 'Various' and item['Date'] != 'unknown' and 'century' not in item['Date'] and item['Date'] != 'Unknown' and item['Date'] != 'n d ' and '(' not in item['Date'] and item['Date'] != 'n d ' and item['Date'] != 'n d' and item['Date'] != 'n  d ' and item['Date'] != 'Unkown' and item['Date'] != 'TBC':
                years.append(item['Date'])
        clean = []
        for el in years:
            if '.'in el:
                el = el.split('.')[0]
            clean.append(int(el))  
        clean.sort()
        name = names.pop(0)
    print("Most ancient artwork at", name, "dates back to", clean[0])
    print("Most recent artwork at", name, "dates back to", clean[-1])

Most ancient artwork at MoMa dates back to 1768
Most recent artwork at MoMa dates back to 2022
Most ancient artwork at Nga dates back to -490
Most recent artwork at Nga dates back to 2021
Most ancient artwork at Tate dates back to 1545
Most recent artwork at Tate dates back to 2012
Most ancient artwork at Met dates back to -240000
Most recent artwork at Met dates back to 2022


#Artworks acquisition

Clean MoMa's artworks' acquisition dates

In [59]:
MoMaNext = MoMaNew.copy(deep=True)
MoMaNext = MoMaNext[MoMaNext['DateAcquired'].notna()]
MoMaNext["DateAcquired"] = MoMaNext["DateAcquired"].apply(cleanDatesMoma)
MoMaNext.to_csv("MoMaNew.csv")

Acquisition' timespan

In [97]:
museums=['MoMaNew.csv','Met.csv', 'Nga.csv', 'Tate.csv']
names = ['MoMa','Met', 'Nga', 'Tate']
for museum in museums:
    with open(museum, mode='r', encoding='utf-8') as csvfile:
        reader = csv.DictReader(csvfile)
        acquisitionyears=[]
        for item in reader:
            if '.' in item['DateAcquired']:
                    item['DateAcquired']= item['DateAcquired'].split('.')[0]
            if item['DateAcquired'] != '' and item['DateAcquired'] != 'Object Number':
                acquisitionyears.append(item['DateAcquired'])
        acquisitionyears.sort()
        name = names.pop(0)
    print("First recorderd acquisition", name, "dates back to", acquisitionyears[0])
    print("Last recorded acquisition", name, "dates back to", acquisitionyears[-1])

First recorderd acquisition MoMa dates back to 1929
Last recorded acquisition MoMa dates back to 2022
First recorderd acquisition Met dates back to 1870
Last recorded acquisition Met dates back to 2022
First recorderd acquisition Nga dates back to 1937
Last recorded acquisition Nga dates back to 2022
First recorderd acquisition Tate dates back to 1823
Last recorded acquisition Tate dates back to 2013


Total artists' number.

In [129]:
def cleanArtistsTate(name):
    if ',' in name:
        splitted = name.split(',')
        name = ''.join(splitted) 
        return name

In [130]:
TateNew = Tate.copy(deep=True)
TateNew["Artist"] = TateNew["Artist"].apply(cleanArtistsTate)
TateNew.to_csv("TateNew.csv")

In [152]:
museums=['MoMaNew.csv', 'Met.csv', 'TateNew.csv', 'Nga.csv']
names = ['MoMa', 'Met','Tate', 'Nga']
for museum in museums:
    with open(museum, mode='r', encoding='utf-8') as csvfile:
        reader = csv.DictReader(csvfile)
        artists = set() 
        for item in reader:
            if item['Artist']!= '' and 'Unidentified'not in item['Artist'] and 'Various' not in item['Artist']:
                if ',' in item['Artist']:
                    item['Artist'] = item['Artist'].split(',')
                    for n in range(len(item['Artist'])):
                        artist= item['Artist'][n]
                        artists.add(artist)
                elif '|' in item['Artist']:
                    listone = item['Artist'].split('|')
                    for n in range(len(listone)):
                        artista= listone[n]
                        artists.add(artista)
                else:
                    artists.add(item['Artist'])
    name = names.pop(0)
    print("Number of artists at", name, "is", len(artists) )

Number of artists at MoMa is 14591
Number of artists at Met is 60950
Number of artists at Tate is 3281
Number of artists at Nga is 16860


Most represented gender in general?
Only for MoMa and Tate since other csv files need Wikidata integration

In [60]:
MoMaGender = MoMaNew.drop_duplicates(subset='Artist', keep="first")
MoMaGender.to_csv('MoMaGender.csv')

In [5]:
TateGender = Tate.drop_duplicates(subset='Artist', keep="first")
TateGender.to_csv('TateGender.csv')

In [7]:
museums=['MoMaGender.csv', 'TateGender.csv']
names = ['MoMa', 'Tate']
for museum in museums:
    with open(museum, mode='r', encoding='utf-8') as csvfile:
        reader = csv.DictReader(csvfile)
        Gender = {'Male':0, 'Female':0}
        for item in reader:
                if 'Male' in item['Gender']:
                    Gender['Male'] += 1
                if 'Female' in item['Gender']:
                    Gender['Female'] += 1
    print(Gender)

{'Male': 10474, 'Female': 2809}
{'Male': 2791, 'Female': 492}


Most represented nationality in general?

In [72]:
museums=['MoMaGender.csv']
names = ['MoMa']
for museum in museums:
    with open(museum, mode='r', encoding='utf-8') as csvfile:
        reader = csv.DictReader(csvfile)
        nationalities = set() 
        for item in reader:
            if ' ' in item['Nationality']:
                    item['Nationality'] = item['Nationality'].split(') ')
                    for n in range(len(item['Nationality'])):
                        nationality= item['Nationality'][n]
                        nationalities.add(nationality)
            else:
                    nationalities.add(item['Nationality'])
        nationalities_clean = set()
        for el in nationalities:
            a = el.replace('(', '').replace(')','')
            nationalities_clean.add(a)
        count_naz= set()
        for el in nationalities_clean:
            if el != '' and el != 'Nationality unknown':
                count_naz.add(el)
print(count_naz)
print(len(count_naz))


{'Palestinian', 'Canadian Inuit', 'Mexican', 'Portuguese', 'Tunisian', 'Ivorian', 'Tanzanian', 'Polish', 'Cameroonian', 'Ugandan', 'Taiwanese', 'Native American', 'Nicaraguan', 'Bangladeshi', 'Chilean', 'Georgian', 'Israeli', 'New Zealander', 'Bulgarian', 'Sierra Leonean', 'Catalan', 'Emirati', 'Costa Rican', 'Slovenian', 'Senegalese', 'Beninese', 'Colombian', 'Greek', 'Burkinabé', 'Icelandic', 'Spanish', 'Kuwaiti', 'Romanian', 'Ecuadorian', 'Malian', 'Filipino', 'Lebanese', 'Singaporean', 'Moroccan', 'Cypriot', 'Kenyan', 'American', 'Latvian', 'Egyptian', 'Venezuelan', 'Dutch', 'Canadian', 'Lithuanian', 'Panamanian', 'Swiss', 'Albanian', 'French', 'Salvadoran', 'Haitian', 'Puerto Rican', 'Peruvian', 'Uruguayan', 'Macedonian', 'Thai', 'Malaysian', 'Namibian', 'Guatemalan', 'Pakistani', 'Yugoslav', 'Argentine', 'Zimbabwean', 'Brazilian', 'Welsh', 'Japanese', 'Kyrgyz', 'Norwegian', 'Paraguayan', 'Estonian', 'South African', 'Hungarian', 'Irish', 'Cambodian', 'Bahamian', 'Ethiopian', 'Ita

In [37]:
museums=['MoMaGender.csv']
names = ['MoMa']
for museum in museums:
    with open(museum, mode='r', encoding='utf-8') as csvfile:
        reader = csv.DictReader(csvfile)
        Nationalities ={}
        for nationality in count_naz:  
            Nationalities[nationality]= 0
        for item in reader:
            for naz in count_naz:
                    if naz in item['Nationality']:
                        Nationalities[naz] += 1 
    print(Nationalities)

{'Welsh': 4, 'Ugandan': 1, 'Paraguayan': 3, 'Georgian': 22, 'Bangladeshi': 1, 'Belgian': 125, 'Albanian': 4, 'Congolese': 6, 'Native American': 10, 'Bulgarian': 5, 'Irish': 24, 'Peruvian': 37, 'Italian': 517, 'Argentine': 150, 'Kuwaiti': 1, 'American': 5584, 'Costa Rican': 2, 'Estonian': 2, 'Guatemalan': 7, 'Swedish': 123, 'Haitian': 16, 'Hungarian': 86, 'Japanese': 545, 'British': 962, 'Turkish': 20, 'Serbian': 17, 'Coptic': 1, 'Mexican': 159, 'Senegalese': 2, 'Danish': 125, 'Pakistani': 4, 'Scottish': 22, 'Vietnamese': 3, 'Lebanese': 10, 'Cypriot': 1, 'Swiss': 421, 'Spanish': 185, 'Burkinabé': 1, 'Moroccan': 8, 'Afghan': 1, 'Icelandic': 21, 'Catalan': 1, 'Tunisian': 2, 'Malian': 3, 'Bahamian': 1, 'Luxembourger': 3, 'Australian': 59, 'Mozambican': 1, 'Uruguayan': 24, 'Austrian': 277, 'Emirati': 1, 'South African': 68, 'Romanian': 27, 'Iranian': 11, 'New Zealander': 10, 'Israeli': 77, 'Indian': 39, 'Palestinian': 3, 'Finnish': 52, 'Czech': 98, 'Greek': 13, 'Cuban': 71, 'Salvadoran': 2,

In [None]:
counts = MoMa['Nationality'].value_counts()
counts.to_csv('nationalities.csv')

In [14]:
def cleanNazMet(naz):
    if ',' in naz:
        naz = naz.split(',')[0]
    if '(' in naz:
        naz = naz.split('(')[0]
    if '?' in naz:
        naz = naz.replace('?', '')
    return naz
    

In [15]:
MetNew = Met.copy(deep=True)
MetNew = MetNew[MetNew['Nationality'].notna()]
MetNew["Nationality"] = MetNew["Nationality"].apply(cleanNazMet)
MetNew.to_csv("MetNew.csv")

In [16]:
museums=['MetNew.csv']
names = ['Met']
for museum in museums:
    with open(museum, mode='r', encoding='utf-8') as csvfile:
        reader = csv.DictReader(csvfile)
        nationalities = set() 
        for item in reader:
            if '|' in item['Nationality']:
                    item['Nationality'] = item['Nationality'].split('|')
                    for n in range(len(item['Nationality'])):
                        nationality= item['Nationality'][n]
                        nationalities.add(nationality)
            else:
                nationalities.add(item['Nationality'])
'''  
        clean_naz = set()
        for el in nationalities:
            if ',' in el:
                el = el.split(',')[0]
            if '(' in el:
                el = el.split('(')[0]
            clean_naz.add(el)    
        final_naz = set()
        for el in clean_naz:
            if el != '':
                final_naz.add(el)
'''  
print(nationalities)

{'Italian [search purposes only]', 'Polish Lithunanian', 'Netherlandish ', 'Northern France', 'Alsatian', 'Swiss', 'Norwegian', 'Canadian ', 'Panamanian', 'Swedish', 'Iran', 'Naples', 'Brazilian', 'Netherlandish', 'French/Flemish', 'Belgian', 'North African ', 'South African', 'Netherlandish or German', 'Emilian', 'Paris', 'Swiss American', 'Klingenthal', 'Fench', 'Rumanian', 'Bremen', 'probably Swiss', 'Haida', 'Nigeria', 'England', 'Algonquin family', 'Finnish', 'Scandinavian', 'Venezuelan', 'American', 'British or American', 'Ukrainian', 'U.S. ', 'American born Britain', ' ', 'Africa', 'active in France by 1894', 'Prussian', 'BRitish', 'Sri Lankan', 'British ', 'Chilean', 'Continental/Swiss', 'German/Solingen', 'London', 'German ', 'Guatemalan', 'Tibetan', 'French Dutch', 'British or French', 'American or German', 'Austrian German', 'Irish', 'Afghani', 'Islamic', 'Italy', 'Norway', 'Scottish or British', 'Hungarian or Austrian', 'Egyptian', 'European', 'Japanese and German', 'Burgun

In [19]:
with open('MetNew.csv', mode='r', encoding='utf-8') as csvfile:
        reader = csv.DictReader(csvfile)
        MetNationalities ={}
        for nationality in nationalities:  
            MetNationalities[nationality]= 0
        for item in reader:
            for naz in nationalities:
                    if naz in item['Nationality']:
                        MetNationalities[naz] += 1 
print(MetNationalities)

{'Italian [search purposes only]': 1, 'Polish Lithunanian': 1, 'Netherlandish ': 3, 'Northern France': 5, 'Alsatian': 1, 'Swiss': 1008, 'Norwegian': 85, 'Canadian ': 2, 'Panamanian': 1, 'Swedish': 294, 'Iran': 44, 'Naples': 1, 'Brazilian': 46, 'Netherlandish': 5309, 'French/Flemish': 1, 'Belgian': 476, 'North African ': 1, 'South African': 30, 'Netherlandish or German': 1, 'Emilian': 1, 'Paris': 3, 'Swiss American': 1, 'Klingenthal': 1, 'Fench': 9, 'Rumanian': 2, 'Bremen': 1, 'probably Swiss': 1, 'Haida': 3, 'Nigeria': 22, 'England': 3, 'Algonquin family': 1, 'Finnish': 136, 'Scandinavian': 5, 'Venezuelan': 15, 'American': 68664, 'British or American': 5, 'Ukrainian': 28, 'U.S. ': 1, 'American born Britain': 3, ' ': 111644, 'Africa': 39, 'active in France by 1894': 1, 'Prussian': 1, 'BRitish': 1, 'Sri Lankan': 1, 'British ': 31, 'Chilean': 18, 'Continental/Swiss': 2, 'German/Solingen': 1, 'London': 4, 'German ': 49, 'Guatemalan': 2, 'Tibetan': 2, 'French Dutch': 1, 'British or French':

#Our analysis.
<br>
1. What are the most acquired artists in museums (in general)?
    - Is there a gender gap in the selection of artists?
    - What are the most represented nationalities (in general)?
    - What are the most represented movements or genres (in general)?
2. How have acquisition criteria changed (over time) in museums?
    - In which years are artists' works mostly acquired?
    - When does the gender gap decreases (if it does)?
    - In which years artists' nationalities more influent on the selection?
    - In which years artists' movements/genres more influent on the selection?
3. If we compare criteria of all museums, in general and over time, do we see any similarity or significant difference?
    - Do certain museums acquire more works based on artists/artists' gender/nationality/movement than others?

Acquisition criteria.
1.  In which years are artists' works mostly acquired?<br>
To answer, we need to count how many times each year shows up in the DateAcquired column.

In [None]:
MoMa['year'] = pd.DatetimeIndex(MoMa['DateAcquired']).year
MoMaNew['DateAcquired'] = MoMa['year']
MoMaNew = MoMaNew[MoMaNew['DateAcquired'].notna()]
MoMaNew.to_csv("MoMaNew.csv")

In [None]:
with open('MoMaNew.csv', mode='r', encoding='utf-8') as csvfile:
    reader = csv.DictReader(csvfile)
    years={}
    for item in reader:
        if item['DateAcquired']not in years:
            years[item['DateAcquired']]= 1
        else:
            years[item['DateAcquired']]+= 1

    print(years)
    #all_years=list(years.keys())
    #print(sorted(all_years))
    

Trova modo di ordinare per value o usa la libreria che sa Laura
Visualizzazione :)

In [None]:
new_dict={}
for key in years:
    key_int=key.split('.')[0]
    key_int=int(key_int)
    if key_int in range(1928,1941):
        if '1930s' not in new_dict.keys():
               new_dict['1930s']= years[key]
        else:
            new_dict['1930s'] += years[key]
    if key_int in range(1940,1951):
        if '1940s' not in new_dict.keys():
               new_dict['1940s']= years[key]
        else:
            new_dict['1940s'] += years[key]
    
    if key_int in range(1950,1961):
        if '1950s' not in new_dict.keys():
               new_dict['1950s']= years[key]
        else:
            new_dict['1950s'] += years[key]
    
    if key_int in range(1960,1971):
        if '1960s' not in new_dict.keys():
               new_dict['1960s']= years[key]
        else:
            new_dict['1960s'] += years[key]
    
    if key_int in range(1970,1981):
        if '1970s' not in new_dict.keys():
               new_dict['1970s']= years[key]
        else:
            new_dict['1970s'] += years[key]
    if key_int in range(1980,1991):
        if '1980s' not in new_dict.keys():
               new_dict['1980s']= years[key]
        else:
            new_dict['1980s'] += years[key]
    
    if key_int in range(1990,2001):
        if '1990s' not in new_dict.keys():
               new_dict['1990s']= years[key]
        else:
            new_dict['1990s'] += years[key]
        
    
print(new_dict)
    

When does the gender gap decreases (if it does)? <br>
Per ogni 10 anni, percentuale di uomini e donne acquisiti e differenza.

In [None]:
with open('MoMaNew.csv', mode='r', encoding='utf-8') as csvfile:
    reader = csv.DictReader(csvfile)
    years={}
    for item in reader:
        if item['DateAcquired']not in years:
            years[item['DateAcquired']]= 1
        else:
            years[item['DateAcquired']]+= 1

    print(years)

When does the gender gap decreases (if it does)?
Per ogni 10 anni, percentuale di uomini e donne acquisiti e differenza.

In which years artists' nationalities more influent on the selection?
Per ogni 10 anni, percentuale di nazionalità acquisite e differenza.

In which years artists'movements/genres more influent on the selection?
Per ogni 10 anni, percentuale di nazionalità acquisite e differenza.

Nga non ha Gender
Met molti Gender sono NaN
