# Example of sourcing data combining Search and Record API

This notebook uses the Python wrapper for the Europeana APIs, developed by the RnD team, to build a series of calls to the Search API and the Record API.
The two APIs are combined because the Search API alone does not always return all the information about an item: some fields are not shown, whereas the Record API returns all the fields of a given object.
For example, not all multilingual fields are served by the Search API.
To retrieve those fields, one option is to use the Search API to find the items of interest and then the Record API to retrieve all fields for each specific item.

You can find more details about the search and the record APIs in the following links:

*   https://pro.europeana.eu/page/search
*   https://pro.europeana.eu/page/record

Both APIs serve data using the Europeana Data Model (EDM): https://pro.europeana.eu/page/intro#edm


In [4]:
#Importing libraries
import pandas as pd
import pyeuropeana.apis as apis
import pyeuropeana.utils as utils
import os
pd.set_option("display.max_rows", 600)
pd.set_option("display.max_columns", 600)
pd.options.mode.chained_assignment = None

# Combining the Search and Record APIs to retrieve field values

In this example we look for the values of the field proxy_dc_type.it.
proxy_dc_type is indexed in Solr, but its values are not returned by the Search API.
Therefore, to retrieve the Italian (it) values of this field for a particular record, we combine the Search API and the Record API.
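
As a rough illustration of what the Record API adds, the sketch below reads the language-tagged values of a field from the `proxies` section of a record. The `sample_record` dictionary is a hand-written stand-in that only mimics the shape of a Record API response (it is not real API output), and `values_for_language` is a hypothetical helper, not part of the Python wrapper.

```python
def values_for_language(record, field, lang):
    """Collect the values of `field` tagged with `lang` across all proxies."""
    values = []
    for proxy in record.get("object", {}).get("proxies", []):
        # Language-aware fields are dicts keyed by language code, e.g. {"it": [...]}
        values.extend(proxy.get(field, {}).get(lang, []))
    return values


# Hand-written stand-in for a Record API response (assumption, not real data)
sample_record = {
    "object": {
        "proxies": [
            {"about": "/proxy/europeana/x/y", "europeanaProxy": True},
            {
                "about": "/proxy/provider/x/y",
                "europeanaProxy": False,
                "dcType": {"it": ["periodico"], "en": ["periodical"]},
            },
        ]
    }
}

print(values_for_language(sample_record, "dcType", "it"))
```

A language with no values for the field simply yields an empty list, so the helper can be applied to many records without extra checks.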

In [6]:
#setting the environment variable that holds the API key
os.environ['EUROPEANA_API_KEY'] = 'api2demo'
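
If a personal key may already be defined in the environment, one option is to set the demo key only as a fallback. This is a minimal sketch of that idea; it assumes, as in the cell above, that the key is read from the `EUROPEANA_API_KEY` environment variable.

```python
import os

# 'api2demo' is the demo key used in this notebook; a key that is already
# present in the environment is kept, because setdefault never overwrites.
os.environ.setdefault("EUROPEANA_API_KEY", "api2demo")

api_key = os.environ["EUROPEANA_API_KEY"]
print("API key is set:", "EUROPEANA_API_KEY" in os.environ)
```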

# Functions

In [4]:

def language_queries(lang1,bilingual,lang2='en'):
    """Build a set of queries to extract information on monolingual or
    bilingual fields, where the second language is English by default.
    lang1: string, first language as ISO code (e.g. fr for French)
    bilingual: string, operator joining the two language clauses:
    "AND" selects bilingual fields (values in both lang1 and lang2),
    "NOT" selects monolingual fields (values in lang1 but not in lang2)
    lang2: string, second language as ISO code (default: en)
    """
    queries={
        'dc_description': f'(proxy_dc_description.{lang1}:* {bilingual} proxy_dc_description.{lang2}:*)',
        'dc_title': f'(proxy_dc_title.{lang1}:* {bilingual} proxy_dc_title.{lang2}:*)',
        'dc_subject': f'(proxy_dc_subject.{lang1}:* {bilingual} proxy_dc_subject.{lang2}:*)',
        'dc_coverage': f'(proxy_dc_coverage.{lang1}:* {bilingual} proxy_dc_coverage.{lang2}:*)',
        'edm_current_location':f'(proxy_edm_currentLocation.{lang1}:* {bilingual} proxy_edm_currentLocation.{lang2}:*)',
        'dcterms_medium': f'(proxy_dcterms_medium.{lang1}:* {bilingual} proxy_dcterms_medium.{lang2}:*)',
        'dcterms_hasPart':f'(proxy_dcterms_hasPart.{lang1}:* {bilingual} proxy_dcterms_hasPart.{lang2}:*)',
        'dcterms_spatial':f'(proxy_dcterms_spatial.{lang1}:* {bilingual} proxy_dcterms_spatial.{lang2}:*)',
        'dc_format':f'(proxy_dc_format.{lang1}:* {bilingual} proxy_dc_format.{lang2}:*)',
        'dc_source':f'(proxy_dc_source.{lang1}:* {bilingual} proxy_dc_source.{lang2}:*)',
        'dc_rights':f'(proxy_dc_rights.{lang1}:* {bilingual} proxy_dc_rights.{lang2}:*)',
        'dc_terms_alternative':f'(proxy_dcterms_alternative.{lang1}:* {bilingual} proxy_dcterms_alternative.{lang2}:*)',
        'dc_type': f'(proxy_dc_type.{lang1}:* {bilingual} proxy_dc_type.{lang2}:*)',
        'dcterms_isPartOf': f'(proxy_dcterms_isPartOf.{lang1}:* {bilingual} proxy_dcterms_isPartOf.{lang2}:*)',
        'dcterms_provenance': f'(proxy_dcterms_provenance.{lang1}:* {bilingual} proxy_dcterms_provenance.{lang2}:*)',
        'dcterms_temporal': f'(proxy_dcterms_temporal.{lang1}:* {bilingual} proxy_dcterms_temporal.{lang2}:*)',
        'edm_isRelatedTo': f'(proxy_edm_isRelatedTo.{lang1}:* {bilingual} proxy_edm_isRelatedTo.{lang2}:*)',
        'edm_dataProvider': f'(provider_aggregation_edm_dataProvider.{lang1}:* {bilingual} provider_aggregation_edm_dataProvider.{lang2}:*)',
        'edm_intermediateProvider': f'(provider_aggregation_edm_intermediateProvider.{lang1}:* {bilingual} provider_aggregation_edm_intermediateProvider.{lang2}:*)',
        'edm_provider': f'(provider_aggregation_edm_provider.{lang1}:* {bilingual} provider_aggregation_edm_provider.{lang2}:*)',
        'dcterms_isReferencedBy': f'(wr_dcterms_isReferencedBy.{lang1}:* {bilingual} wr_dcterms_isReferencedBy.{lang2}:*)'
            }
    return queries
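
To make the shape of these queries concrete, the sketch below rebuilds a single entry with the same pattern used in `language_queries`. `build_field_query` is a hypothetical helper introduced only for this illustration.

```python
def build_field_query(field, lang1, operator, lang2="en"):
    """Return one query in the same shape as the entries of language_queries."""
    return f"({field}.{lang1}:* {operator} {field}.{lang2}:*)"


bilingual = build_field_query("proxy_dc_title", "fr", "AND")
monolingual = build_field_query("proxy_dc_title", "fr", "NOT")

print(bilingual)    # (proxy_dc_title.fr:* AND proxy_dc_title.en:*)
print(monolingual)  # (proxy_dc_title.fr:* NOT proxy_dc_title.en:*)
```

The `AND` form matches items with a title in both French and English, while the `NOT` form matches items with a French title and no English one.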

In [5]:
def multiple_language_queries(lang ,bilingual, lang2='en'):
    """Build a dictionary where each key is a language and each value is
    the dictionary of queries returned by language_queries for that language.
    lang: list of languages, ISO codes
    bilingual: string, "AND" for bilingual queries or "NOT" for
    monolingual queries
    lang2: string, second language as ISO code (default: en)
    """
    queries={}
    for l in lang:
        # pass lang2 through instead of hard-coding 'en'
        queries_single_l=language_queries(l,bilingual,lang2=lang2)
        queries[l]=queries_single_l
    return queries

In [6]:
def tot_results_queries(lang ,n_rows=1, save=False, biling=True):
    """Return two dataframes indexed by the languages in lang:
    - df: one column per metadata field with the number of hits for that
      field, plus a Tot_results column with the row total
    - df_percentage: the same counts expressed as a fraction of Tot_results
    The row for 'en' is dropped from both dataframes.
    lang: list of languages, ISO codes
    n_rows: number of items returned by each search call
    save: boolean, if True the dataframe of counts is saved as a csv file
    biling: boolean, if True the bilingual version of the queries is used,
    if False the monolingual version
    """
    if biling:
        queries_dict=multiple_language_queries(lang ,bilingual='AND', lang2='en')
    else:
        queries_dict=multiple_language_queries(lang ,bilingual='NOT', lang2='en')
    df=pd.DataFrame(index=lang)
    for l in lang:
        for key, value in queries_dict[l].items():  
            CHO_data = apis.search(query = '*:*',qf=f'{value}' ,rows = n_rows)
            tot_results=CHO_data['totalResults']
            df.loc[l,key]=tot_results 
    df.loc[:,'Tot_results'] = df.iloc[:,:].sum(axis=1)
    if 'en' in df.index:
        df.drop('en', axis=0, inplace=True)
    df_percentage=pd.DataFrame(columns= df.columns, index=df.index)
    for col in df.columns:
        df_percentage[col]=df[col]/df.Tot_results
    df_percentage.drop('Tot_results',axis=1, inplace=True)
    df_percentage.loc[:,'Tot_results'] = df_percentage.iloc[:,:].sum(axis=1)
    if 'en' in df_percentage.index:
        df_percentage.drop('en', axis=0, inplace=True)
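    tot_lang='_'.join(lang)
    # 'today' is not defined elsewhere in this notebook; here it is assumed
    # to be the current date, used as a prefix for the file names
    today=pd.Timestamp.today().strftime('%Y-%m-%d')
    if save and biling:
        df.to_csv(f'{today}_{tot_lang}_tot_results_bilingual.csv')
    elif save and not biling:
        df.to_csv(f'{today}_{tot_lang}_tot_results_monolingual.csv')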
    return df, df_percentage
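
The percentage table built column by column in `tot_results_queries` can also be obtained in one vectorised step. The sketch below shows this on a tiny dataframe; the hit counts are made up for illustration and do not come from the API.

```python
import pandas as pd

# Made-up hit counts for two languages and two fields (illustration only)
counts = pd.DataFrame(
    {"dc_title": [30, 10], "dc_description": [70, 40]},
    index=["fr", "de"],
)
counts["Tot_results"] = counts.sum(axis=1)

# Divide every field column by the row total in a single vectorised step
shares = counts.drop(columns="Tot_results").div(counts["Tot_results"], axis=0)
shares["Tot_results"] = shares.sum(axis=1)
print(shares)
```

Each row of `shares` sums to 1, which is a quick sanity check on the totals.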


In [7]:
def queries_items_uri(lang, n_rows=1, save=False,biling=True):
    """Build a dataframe where the first column holds the query executed
    and the second the URI of an item that satisfies that query.
    lang: string, a single language as ISO code (not a list)
    n_rows: number of items returned by the first search call
    save: boolean, if True the resulting dataframe is saved as a csv file
    biling: boolean, if True the bilingual version of the queries is used,
    if False the monolingual version
    """
    if biling:
        queries_dict=multiple_language_queries([lang] ,bilingual='AND', lang2='en')
    else:
        queries_dict=multiple_language_queries([lang] ,bilingual='NOT', lang2='en')
    # initializing the list of dataframes
    df_list=[]
    for _ ,value in queries_dict[lang].items():  
        print(value)
        df=pd.DataFrame(columns=['field','europeana_uri'])
        CHO_data = apis.search(query = '*:*',qf=f'{value}' ,rows = n_rows)
        n_files=CHO_data['totalResults']
        if n_files > 0:
            CHO_data_all = apis.search(query = '*:*',qf=f'{value}' ,rows = n_files)
            print('ok')
            df['europeana_uri']=utils.search2df(CHO_data_all).uri
            print(len(df))
            df['field']=value
            df_list.append(df)     
        else:
            pass 
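    if not df_list:
        # no query returned any item: pd.concat would fail on an empty list
        return pd.DataFrame(columns=['field','europeana_uri'])
    df_tot = pd.concat(df_list, ignore_index=True) # concatenate all dataframes from all queries
    df_tot_clear_dup=df_tot.drop_duplicates(subset=None, keep='first', inplace=False)
    # 'today' is not defined elsewhere in this notebook; assumed to be the current date
    today=pd.Timestamp.today().strftime('%Y-%m-%d')
    if save and biling:
        df_tot_clear_dup.to_csv(f'{today}_{lang}_en_bilingual.csv')
    elif save and not biling:
        df_tot_clear_dup.to_csv(f'{today}_{lang}_monolingual.csv')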
    return df_tot_clear_dup
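
`queries_items_uri` asks for all hits in one call by setting `rows` to `totalResults`. For large result sets the number of rows per request may be capped, so one option is to page through the results with the `nextCursor` value that search responses carry. The sketch below shows only the loop: `fetch_page` is a hypothetical callable standing in for a search call, and `fake_fetch_page` with its `FAKE_PAGES` data is a made-up in-memory replacement so that the example runs offline.

```python
def collect_all_items(fetch_page, max_pages=100):
    """Follow nextCursor values until the result set is exhausted."""
    items, cursor = [], "*"  # "*" starts a cursor-based traversal
    for _ in range(max_pages):
        response = fetch_page(cursor)
        items.extend(response.get("items", []))
        cursor = response.get("nextCursor")
        if not cursor:  # no cursor means there are no further pages
            break
    return items


# Made-up pages keyed by cursor, used instead of real API calls
FAKE_PAGES = {
    "*": {"items": [{"id": "/1/a"}, {"id": "/1/b"}], "nextCursor": "c1"},
    "c1": {"items": [{"id": "/1/c"}]},
}


def fake_fetch_page(cursor):
    return FAKE_PAGES[cursor]


all_items = collect_all_items(fake_fetch_page)
print([item["id"] for item in all_items])
```

With the wrapper, `fetch_page` could be a small function that runs the search with the given cursor, provided the wrapper accepts a cursor argument (this is not verified here). The `max_pages` limit is a safety net against an endless loop.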

In [8]:
def monoling_biling_to_stack(mono_nr, bili_nr, lang_list,save=True):
    """Generate a dataframe whose columns are
    - the number of bilingual tags
    - the number of monolingual tags
    - the total number of tags
    The index of the dataframe is the list of languages in lang_list.
    Three versions of the dataframe are returned:
    - df_sorted_biling: rows sorted by descending number of bilingual tags
    - df_sorted_monoling: rows sorted by descending number of monolingual tags
    - df_sorted_tot_lang_tagged: rows sorted by descending total number of
      tags, monolingual and bilingual
    Parameters
    mono_nr: dataframe of monolingual hits per language (its Tot_results column is used)
    bili_nr: dataframe of bilingual hits per language (its Tot_results column is used)
    lang_list: list of languages considered
    save: boolean, if True the dataframe sorted by bilingual tags is saved as a csv file
    """
    df_tot=pd.DataFrame({'n_biling_tag':bili_nr.Tot_results,'n_monoling_tag':mono_nr.Tot_results}, index=lang_list)
    df_tot.loc[:,'Tot_lang_tag']=df_tot.loc[:,'n_biling_tag']+df_tot.loc[:,'n_monoling_tag']
    if 'en'in df_tot.index:
        df_sorted=df_tot.drop('en',axis=0)  
    else:
        df_sorted=df_tot
    df_sorted_biling=df_sorted.sort_values(by='n_biling_tag', ascending=False)
    df_sorted_monoling=df_sorted.sort_values(by='n_monoling_tag', ascending=False)
    df_sorted_tot_lang_tagged=df_sorted.sort_values(by='Tot_lang_tag', ascending=False)
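    if save:
        # 'today' is not defined elsewhere in this notebook; assumed to be the current date
        today=pd.Timestamp.today().strftime('%Y-%m-%d')
        file_name=f'{today}_mono_bilingual_tot_results.csv'
        df_sorted_biling.to_csv(file_name)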
    return df_sorted_biling,df_sorted_monoling,df_sorted_tot_lang_tagged

In [45]:
#Function to extract the europeana_id values of the items that match a query
def search_api(query, n_rows):
    response = apis.search(
        query = query,
        rows = n_rows,
        profile='rich'
    )
    df_search=utils.search2df(response).europeana_id
    return df_search

In [46]:
#Function to extract data that correspond to the id numbers found in search_api function
def record_api(items_id):
    df_list=[]
    for item in items_id:
        data=apis.record(f'{item}')
        df_0=pd.json_normalize(data,['object','proxies'])
        df_proxy_provider=df_0.iloc[1] #selecting the provider proxy, which holds the fields of interest
        df_proxy_provider=pd.DataFrame(df_proxy_provider)
        df_proxy_provider=df_proxy_provider.transpose()
        df_list.append(df_proxy_provider)
    df_proxy_tot = pd.concat(df_list, ignore_index=True, axis=0)
    return df_proxy_tot
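
The function above picks the provider proxy by position (`iloc[1]`). An alternative sketch, assuming each proxy carries the `europeanaProxy` flag seen in the output of this notebook, is to filter on that flag so the selection does not depend on the order of the proxies. `sample_record` is a hand-written stand-in, not real API output.

```python
import pandas as pd

# Hand-written stand-in for the 'proxies' part of a record (not real data)
sample_record = {
    "object": {
        "proxies": [
            {"about": "/proxy/europeana/1/a", "europeanaProxy": True},
            {
                "about": "/proxy/provider/1/a",
                "europeanaProxy": False,
                "dcType": {"en": ["printed serial"]},
            },
        ]
    }
}

proxies = pd.json_normalize(sample_record, record_path=["object", "proxies"])
# Keep the proxies that are not the Europeana proxy, wherever they appear
provider_proxies = proxies[~proxies["europeanaProxy"].astype(bool)]
provider_proxies = provider_proxies.reset_index(drop=True)
print(provider_proxies[["about", "dcType.en"]])
```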

In [50]:
def record_api(items_id):
    df_list=[]
    for item in items_id:
        data=apis.record(f'{item}')
        data_jnorm=pd.json_normalize(data)
        df_list.append(data_jnorm)
    df_jnorm_tot = pd.concat(df_list, ignore_index=True, axis=0)
    return df_jnorm_tot

In [54]:
def retrieve_norm_data(query, n_rows):
    search_results_list=search_api(query, n_rows)
    record_data=record_api(search_results_list)
    return record_data

In [38]:
#Example: query --> europeana_id values that match it --> Record API to extract the provider proxy data
# and place them in dataframe format
query= '(proxy_dc_type.en:* AND proxy_dc_type.de:*)' #dc_type not served in Search API results
df_search=search_api(query, 5) #search_api requires n_rows; 5 matches the five ids in the df_search output

In [55]:
query= '(proxy_dc_type.en:* AND proxy_dc_type.de:*)'

In [56]:
retrieve_norm_data(query, 5)

Unnamed: 0,apikey,success,statsDuration,requestNumber,object.about,object.aggregations,object.concepts,object.edmDatasetName,object.europeanaAggregation.about,object.europeanaAggregation.aggregatedCHO,object.europeanaAggregation.edmCountry.def,object.europeanaAggregation.edmLanguage.def,object.europeanaAggregation.edmRights.def,object.europeanaAggregation.edmPreview,object.europeanaAggregation.edmLandingPage,object.europeanaAggregation.dqvHasQualityAnnotation,object.europeanaCollectionName,object.europeanaCompleteness,object.organizations,object.providedCHOs,object.proxies,object.qualityAnnotations,object.timespans,object.timestamp_created,object.timestamp_created_epoch,object.timestamp_update,object.timestamp_update_epoch,object.type
0,api2demo,True,240,999,/9200360/BibliographicResource_3000100168721,[{'about': '/aggregation/provider/9200360/Bibl...,[{'about': 'http://vocab.getty.edu/aat/3000266...,[9200360_Ag_EU_TEL_a0639_Newspapers_Luxembourg],/aggregation/europeana/9200360/BibliographicRe...,/item/9200360/BibliographicResource_3000100168721,[Luxembourg],[de],[http://rightsstatements.org/vocab/InC/1.0/],https://api.europeana.eu/thumbnail/v2/url.json...,https://www.europeana.eu/item/9200360/Bibliogr...,[/item/9200360/BibliographicResource_300010016...,[9200360_Ag_EU_TEL_a0639_Newspapers_Luxembourg],6,[{'about': 'http://data.europeana.eu/organizat...,[{'about': '/item/9200360/BibliographicResourc...,[{'about': '/proxy/europeana/9200360/Bibliogra...,[{'about': '/item/9200360/BibliographicResourc...,"[{'about': 'http://semium.org/time/AD2xxx', 'p...",2014-10-02T10:16:12.379Z,1412244972379,2014-10-02T10:16:12.379Z,1412244972379,TEXT
1,api2demo,True,236,999,/9200360/BibliographicResource_3000100168720,[{'about': '/aggregation/provider/9200360/Bibl...,[{'about': 'http://vocab.getty.edu/aat/3000266...,[9200360_Ag_EU_TEL_a0639_Newspapers_Luxembourg],/aggregation/europeana/9200360/BibliographicRe...,/item/9200360/BibliographicResource_3000100168720,[Luxembourg],[de],[http://rightsstatements.org/vocab/InC/1.0/],https://api.europeana.eu/thumbnail/v2/url.json...,https://www.europeana.eu/item/9200360/Bibliogr...,[/item/9200360/BibliographicResource_300010016...,[9200360_Ag_EU_TEL_a0639_Newspapers_Luxembourg],6,[{'about': 'http://data.europeana.eu/organizat...,[{'about': '/item/9200360/BibliographicResourc...,[{'about': '/proxy/europeana/9200360/Bibliogra...,[{'about': '/item/9200360/BibliographicResourc...,"[{'about': 'http://semium.org/time/AD2xxx', 'p...",2014-10-02T10:16:12.378Z,1412244972378,2014-10-02T10:16:12.378Z,1412244972378,TEXT
2,api2demo,True,242,999,/9200360/BibliographicResource_3000100168719,[{'about': '/aggregation/provider/9200360/Bibl...,[{'about': 'http://vocab.getty.edu/aat/3000266...,[9200360_Ag_EU_TEL_a0639_Newspapers_Luxembourg],/aggregation/europeana/9200360/BibliographicRe...,/item/9200360/BibliographicResource_3000100168719,[Luxembourg],[de],[http://rightsstatements.org/vocab/InC/1.0/],https://api.europeana.eu/thumbnail/v2/url.json...,https://www.europeana.eu/item/9200360/Bibliogr...,[/item/9200360/BibliographicResource_300010016...,[9200360_Ag_EU_TEL_a0639_Newspapers_Luxembourg],6,[{'about': 'http://data.europeana.eu/organizat...,[{'about': '/item/9200360/BibliographicResourc...,[{'about': '/proxy/europeana/9200360/Bibliogra...,[{'about': '/item/9200360/BibliographicResourc...,"[{'about': 'http://semium.org/time/AD2xxx', 'p...",2014-10-02T10:16:12.359Z,1412244972359,2014-10-02T10:16:12.359Z,1412244972359,TEXT
3,api2demo,True,228,999,/9200360/BibliographicResource_3000100168718,[{'about': '/aggregation/provider/9200360/Bibl...,[{'about': 'http://vocab.getty.edu/aat/3000266...,[9200360_Ag_EU_TEL_a0639_Newspapers_Luxembourg],/aggregation/europeana/9200360/BibliographicRe...,/item/9200360/BibliographicResource_3000100168718,[Luxembourg],[de],[http://rightsstatements.org/vocab/InC/1.0/],https://api.europeana.eu/thumbnail/v2/url.json...,https://www.europeana.eu/item/9200360/Bibliogr...,[/item/9200360/BibliographicResource_300010016...,[9200360_Ag_EU_TEL_a0639_Newspapers_Luxembourg],6,[{'about': 'http://data.europeana.eu/organizat...,[{'about': '/item/9200360/BibliographicResourc...,[{'about': '/proxy/europeana/9200360/Bibliogra...,[{'about': '/item/9200360/BibliographicResourc...,"[{'about': 'http://semium.org/time/AD2xxx', 'p...",2014-10-02T10:16:12.357Z,1412244972357,2014-10-02T10:16:12.357Z,1412244972357,TEXT
4,api2demo,True,245,999,/9200360/BibliographicResource_3000100168717,[{'about': '/aggregation/provider/9200360/Bibl...,[{'about': 'http://vocab.getty.edu/aat/3000266...,[9200360_Ag_EU_TEL_a0639_Newspapers_Luxembourg],/aggregation/europeana/9200360/BibliographicRe...,/item/9200360/BibliographicResource_3000100168717,[Luxembourg],[de],[http://rightsstatements.org/vocab/InC/1.0/],https://api.europeana.eu/thumbnail/v2/url.json...,https://www.europeana.eu/item/9200360/Bibliogr...,[/item/9200360/BibliographicResource_300010016...,[9200360_Ag_EU_TEL_a0639_Newspapers_Luxembourg],6,[{'about': 'http://data.europeana.eu/organizat...,[{'about': '/item/9200360/BibliographicResourc...,[{'about': '/proxy/europeana/9200360/Bibliogra...,[{'about': '/item/9200360/BibliographicResourc...,"[{'about': 'http://semium.org/time/AD2xxx', 'p...",2014-10-02T10:16:12.350Z,1412244972350,2014-10-02T10:16:12.350Z,1412244972350,TEXT


In [39]:
df_search

0    /9200360/BibliographicResource_3000100168721
1    /9200360/BibliographicResource_3000100168720
2    /9200360/BibliographicResource_3000100168719
3    /9200360/BibliographicResource_3000100168718
4    /9200360/BibliographicResource_3000100168717
Name: europeana_id, dtype: object

In [38]:
df_proxy_tot=record_api(df_search)
df_proxy_tot.head()

Unnamed: 0,about,proxyIn,proxyFor,edmType,europeanaProxy,dcDate.def,dcType.def,dctermsTemporal.def,year.def,edmIsNextInSequence,dcDescription.def,dcIdentifier.def,dcLanguage.def,dcTitle.def,dcType.de,dcType.en,dcType.fr,dctermsIsPartOf.def,dctermsIssued.def
0,/proxy/provider/9200360/BibliographicResource_...,[/aggregation/provider/9200360/BibliographicRe...,/item/9200360/BibliographicResource_3000100168721,TEXT,False,[2003],"[http://vocab.getty.edu/aat/300026656, Newspap...",,,[http://data.theeuropeanlibrary.org/Bibliograp...,[d'Lëtzebuerger Land 2003-05-09],[8b438a52-6e9d-48d2-bbd7-163a212714b3],"[lb, de, fr]",[d'Lëtzebuerger Land - 2003-05-09],[Gedruckte Periodika],"[Analytic serial, printed serial]",[publication en série imprimée],[http://data.theeuropeanlibrary.org/Bibliograp...,[2003-05-09]
1,/proxy/provider/9200360/BibliographicResource_...,[/aggregation/provider/9200360/BibliographicRe...,/item/9200360/BibliographicResource_3000100168720,TEXT,False,[2004],"[http://vocab.getty.edu/aat/300026656, Newspap...",,,[http://data.theeuropeanlibrary.org/Bibliograp...,[d'Lëtzebuerger Land 2004-11-05],[52c6095d-f9bc-44d0-9526-b08e49fd844c],"[lb, de, fr]",[d'Lëtzebuerger Land - 2004-11-05],[Gedruckte Periodika],"[Analytic serial, printed serial]",[publication en série imprimée],[http://data.theeuropeanlibrary.org/Bibliograp...,[2004-11-05]
2,/proxy/provider/9200360/BibliographicResource_...,[/aggregation/provider/9200360/BibliographicRe...,/item/9200360/BibliographicResource_3000100168719,TEXT,False,[2004],"[http://vocab.getty.edu/aat/300026656, Newspap...",,,[http://data.theeuropeanlibrary.org/Bibliograp...,[d'Lëtzebuerger Land 2004-12-17],[0cea4a23-188c-400c-b140-c715800d67db],"[lb, de, fr]",[d'Lëtzebuerger Land - 2004-12-17],[Gedruckte Periodika],"[Analytic serial, printed serial]",[publication en série imprimée],[http://data.theeuropeanlibrary.org/Bibliograp...,[2004-12-17]
3,/proxy/provider/9200360/BibliographicResource_...,[/aggregation/provider/9200360/BibliographicRe...,/item/9200360/BibliographicResource_3000100168718,TEXT,False,[2005],"[http://vocab.getty.edu/aat/300026656, Newspap...",,,[http://data.theeuropeanlibrary.org/Bibliograp...,[d'Lëtzebuerger Land 2005-11-25],[0fcde0b2-cf98-4f5c-9b83-74b06b17cedf],"[lb, de, fr]",[d'Lëtzebuerger Land - 2005-11-25],[Gedruckte Periodika],"[Analytic serial, printed serial]",[publication en série imprimée],[http://data.theeuropeanlibrary.org/Bibliograp...,[2005-11-25]
4,/proxy/provider/9200360/BibliographicResource_...,[/aggregation/provider/9200360/BibliographicRe...,/item/9200360/BibliographicResource_3000100168717,TEXT,False,[2005],"[http://vocab.getty.edu/aat/300026656, Newspap...",,,[http://data.theeuropeanlibrary.org/Bibliograp...,[d'Lëtzebuerger Land 2005-08-12],[2706ca1c-8905-4161-a5a1-9bc493dc5984],"[lb, de, fr]",[d'Lëtzebuerger Land - 2005-08-12],[Gedruckte Periodika],"[Analytic serial, printed serial]",[publication en série imprimée],[http://data.theeuropeanlibrary.org/Bibliograp...,[2005-08-12]


In [None]:
query= '(proxy_dc_type.it:*)' #dc_type not served in Search API results
df_search=search_api(query, 5) #search_api requires n_rows; 5 is an arbitrary choice
df_search


provider_aggregation_edm_dataProvider
provider_aggregation_edm_intermediateProvider
provider_aggregation_edm_provider
wr_dcterms_isReferencedBy

In [13]:
queries= ['(wr_dcterms_isReferencedBy:*)'] #query on the wr_dcterms_isReferencedBy field
df_list=[]
for query in queries:
    df_search=search_api(query, 5) #search_api requires n_rows; 5 is an arbitrary choice
    df_proxy_tot=record_api(df_search)
    df_list.append(df_proxy_tot) 
df_tot = pd.concat(df_list, ignore_index=True, axis=0)


In [14]:
query= '(*:*)'
item=search_api(query, 5)[2] #third result; search_api requires n_rows (5 is an arbitrary choice)
data=apis.record(f'{item}')
data
item

'/99/_Resource_103655663'

In [230]:
utils.process_CHO_record(data)

{'europeana_id': '/449/libria_317731',
 'image_url': 'https://www.byterfly.eu/iiif-server/iiif/2/http%3A%2F%2F150.145.48.48%3A8080%2Ffedora%2Fobjects%2Flibria:317746%2Fdatastreams%2FJP2%2Fcontent/full/full/0/default.jpg',
 'uri': 'http://data.europeana.eu/item/449/libria_317731',
 'dataset_name': '449_CulturaItalia_Byterfly_Leconomista',
 'country': 'Italy',
 'language': 'it',
 'type': 'TEXT',
 'title': "Avviso al popolo sul bisogno suo primario o sia Trattato sulla totale e perfetta libertà nel commercio de' grani",
 'title_lang': {'def': "Avviso al popolo sul bisogno suo primario o sia Trattato sulla totale e perfetta libertà nel commercio de' grani"},
 'rights': 'http://rightsstatements.org/vocab/NoC-OKLR/1.0/',
 'provider': 'http://data.europeana.eu/organization/1482250000000338951',
 'provider_lang': {'def': 'http://data.europeana.eu/organization/1482250000000338951'}}

In [None]:
data

In [16]:
flat=pd.json_normalize(data)
flat

Unnamed: 0,apikey,success,statsDuration,requestNumber,object.about,object.agents,object.aggregations,object.concepts,object.edmDatasetName,object.europeanaAggregation.about,object.europeanaAggregation.aggregatedCHO,object.europeanaAggregation.edmCountry.def,object.europeanaAggregation.edmLanguage.def,object.europeanaAggregation.edmPreview,object.europeanaAggregation.edmLandingPage,object.europeanaAggregation.dqvHasQualityAnnotation,object.europeanaCollectionName,object.europeanaCompleteness,object.organizations,object.places,object.providedCHOs,object.proxies,object.qualityAnnotations,object.timespans,object.timestamp_created,object.timestamp_created_epoch,object.timestamp_update,object.timestamp_update_epoch,object.type
0,api2demo,True,454,999,/99/_Resource_103655663,"[{'about': '#Name:22693', 'prefLabel': {'def':...",[{'about': '/aggregation/provider/99/_Resource...,"[{'about': '#Keyword:1964', 'prefLabel': {'en'...",[99_RoL_NLScotland_RareBooksDigitalGallery_pdf],/aggregation/europeana/99/_Resource_103655663,/item/99/_Resource_103655663,[United Kingdom],[mul],https://api.europeana.eu/thumbnail/v2/url.json...,https://www.europeana.eu/item/99/_Resource_103...,"[/item/99/_Resource_103655663#contentTier, /it...",[99_RoL_NLScotland_RareBooksDigitalGallery_pdf],7,[{'about': 'http://data.europeana.eu/organizat...,"[{'about': '#Place:1466', 'prefLabel': {'en': ...",[{'about': '/item/99/_Resource_103655663'}],[{'about': '/proxy/europeana/99/_Resource_1036...,[{'about': '/item/99/_Resource_103655663#conte...,"[{'about': 'http://semium.org/time/1875', 'pre...",2019-08-02T13:56:41.077Z,1564754201077,2019-08-02T13:56:41.077Z,1564754201077,TEXT


In [201]:
'object.aggregations' in flat.columns

True

In [33]:
if 'object.aggregations'in flat.columns:
    flat_1=pd.json_normalize(data, record_path=['object','aggregations']) 
else:
    flat_1=pd.DataFrame()
if 'object.organizations'in flat.columns:    
    flat_2=pd.json_normalize(data, record_path=['object','organizations'])
else:
    flat_2=pd.DataFrame()
# if 'object.aggregations'in flat.columns:        
#     flat_3=pd.json_normalize(data, record_path=['object','aggregations','webResources'])
# else:
    #flat_3=pd.DataFrame()
if 'object.places'in flat.columns:  
    flat_3=pd.json_normalize(data, record_path=['object','places'])
else:
    flat_3=pd.DataFrame()
if 'object.providedCHOs'in flat.columns:
    flat_4=pd.json_normalize(data, record_path=['object','providedCHOs'])
else:
    flat_4=pd.DataFrame()
if 'object.proxies'in flat.columns:
    #Here I select the provider proxy
    flat_5=pd.json_normalize(data, record_path=['object','proxies'])
    flat_5_filt=flat_5.loc[[1]].reset_index(drop=True)
else:
    flat_5_filt=pd.DataFrame()
# if 'object.qualityAnnotations'in flat.columns:
#     flat_7=pd.json_normalize(data, record_path=['object','qualityAnnotations'])
# else:
#     flat_7=pd.DataFrame()
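
The repeated if/else blocks in the cell above all follow the same pattern: flatten one section of the record if it exists, otherwise fall back to an empty dataframe. A minimal sketch of that pattern as a function is given below; `normalize_section` is a hypothetical helper and `sample_record` is a hand-written stand-in for a record, not real API output.

```python
import pandas as pd


def normalize_section(record, section):
    """Flatten record['object'][section], or return an empty DataFrame."""
    entries = record.get("object", {}).get(section)
    if not entries:
        return pd.DataFrame()
    return pd.json_normalize(entries)


# Hand-written stand-in for a record (assumption, not real API output)
sample_record = {
    "object": {
        "places": [{"about": "#Place:1", "prefLabel": {"en": ["Edinburgh"]}}],
        "providedCHOs": [{"about": "/item/1/a"}],
    }
}

sections = ["aggregations", "organizations", "places", "providedCHOs"]
frames = {name: normalize_section(sample_record, name) for name in sections}
for name, frame in frames.items():
    print(name, frame.shape)
```

Keeping the frames in a dictionary keyed by section name also avoids numbered variables such as `flat_1` and `flat_2`.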

In [34]:
# #here I select edmDataProvider,edmProvider,edmRights
# flat_1_filter=flat_1.drop(['about','edmIsShownAt','edmUgc','aggregatedCHO','webResources'], axis=1)
# #here I select textAttributionSnippet from webresources
# flat_3_web_res_filt=flat_3.drop(['about','htmlAttributionSnippet','ebucoreHasMimeType','ebucoreFileByteSize','rdfType'], axis=1)
# #Here I select the concepts
# flat_4_filt=flat_4.drop(['about','latitude','longitude','owlSameAs'], axis=1)
# #Here I select the provider proxy
# flat_6_filt=flat_6.iloc[[1]].drop(['about', 'proxyIn', 'proxyFor' ],axis=1).reset_index(drop=True)
# flat_6_filt

In [35]:
flat_3

Unnamed: 0,about,latitude,longitude,owlSameAs,prefLabel.en
0,#Place:1466,55.95,-3.2167,[http://vocab.getty.edu/tgn/7009546],"[Europe, United Kingdom, Scotland, Edinburgh, ..."


In [37]:
df=pd.concat([flat_1 ,flat_2,flat_3,flat_4,flat_5_filt],axis=1)
df

Unnamed: 0,about,edmIsShownBy,edmIsShownAt,edmObject,edmUgc,aggregatedCHO,webResources,edmDataProvider.def,edmProvider.def,edmRights.def,about.1,prefLabel.de,prefLabel.fi,prefLabel.ru,prefLabel.pt,prefLabel.el,prefLabel.en,prefLabel.hr,prefLabel.it,prefLabel.fr,prefLabel.es,prefLabel.ga,prefLabel.pl,prefLabel.ca,prefLabel.nl,about.2,latitude,longitude,owlSameAs,prefLabel.en.1,about.3,about.4,proxyIn,proxyFor,edmType,europeanaProxy,dcLanguage.def,dctermsIssued.def,year.def,dcContributor.def,dcFormat.def,dcIdentifier.def,dcPublisher.def,dcSubject.def,dcTitle.def,dcType.def,edmCurrentLocation.def
0,/aggregation/provider/99/_Resource_103655663,https://deriv.nls.uk/dcn23/1039/0013/103900138...,https://digital.nls.uk/103655663,https://deriv.nls.uk/dcn30/1043/5077/104350772...,False,/item/99/_Resource_103655663,[{'about': 'https://deriv.nls.uk/dcn30/1043/50...,[http://data.europeana.eu/organization/1482250...,[http://data.europeana.eu/organization/1482250...,[http://rightsstatements.org/vocab/CNE/1.0/],http://data.europeana.eu/organization/14822500...,[National Library of Scotland],[Skotlannin kansalliskirjasto],[Национальная библиотека Шотландии],[Biblioteca Nacional da Escócia],[Εθνική Βιβλιοθήκη της Σκωτίας],[National Library of Scotland],[Škotska nacionalna knjižnica],[National Library of Scotland],[bibliothèque nationale d'Écosse],[Biblioteca Nacional de Escocia],[Leabharlann Náiseanta na h-Alba],[Biblioteka Narodowa Szkocji],[National Library of Scotland],[National Library of Scotland],#Place:1466,55.95,-3.2167,[http://vocab.getty.edu/tgn/7009546],"[Europe, United Kingdom, Scotland, Edinburgh, ...",/item/99/_Resource_103655663,/proxy/provider/99/_Resource_103655663,[/aggregation/provider/99/_Resource_103655663],/item/99/_Resource_103655663,TEXT,False,[eng],[1875],,"[#Name:25125, #Name:25125]",[1 online resource],[#Resource:103655663],[#Name:22693],"[#Keyword:1919, #Keyword:3594, #Keyword:1964, ...",[Ladies' Edinburgh Debating Society publicatio...,[Periodicals],[http://sws.geonames.org/2650225]
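
In the concatenated dataframe several columns share the name `about`, which makes it hard to tell which section a value came from. One possible way around this, sketched here on two tiny made-up frames, is to pass `keys` to `pd.concat` so that each column is labelled with its source section.

```python
import pandas as pd

# Two tiny made-up frames that both contain an 'about' column
aggregations = pd.DataFrame({"about": ["/aggregation/provider/1/a"]})
places = pd.DataFrame({"about": ["#Place:1"], "latitude": [55.95]})

combined = pd.concat(
    [aggregations, places],
    axis=1,
    keys=["aggregations", "places"],  # first column level = source section
)
print(combined.columns.tolist())
print(combined[("places", "about")].iloc[0])
```

The columns become pairs such as `("places", "about")`, so values from different sections can be selected without ambiguity.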


In [25]:
flat_3

Unnamed: 0,about,textAttributionSnippet,htmlAttributionSnippet,ebucoreHasMimeType,ebucoreFileByteSize,ebucoreWidth,ebucoreHeight,edmHasColorSpace,edmComponentColor,ebucoreOrientation,rdfType,edmSpatialResolution,webResourceDcRights.def,webResourceEdmRights.def,dcFormat.def,dcSource.def,dcCreator.def
0,https://deriv.nls.uk/dcn30/1043/5077/104350772...,Ladies' Edinburgh Debating Society publication...,<link rel='stylesheet' type='text/css' href='h...,image/jpeg,117819,1000.0,1691.0,sRGB,"[#DCDCDC, #FFE4C4, #F5DEB3, #EEE8AA, #D3D3D3, ...",portrait,,,,,,,
1,https://deriv.nls.uk/dcn23/1039/0013/103900138...,Ladies' Edinburgh Debating Society publication...,<link rel='stylesheet' type='text/css' href='h...,application/pdf,13791933,,,,,,http://www.europeana.eu/schemas/edm/FullTextRe...,0.0,[The work is likely to be in the public domain...,[http://rightsstatements.org/vocab/CNE/1.0/],[pdf],[#Resource:103655663],[National Library of Scotland]
2,https://digital.nls.uk/103655663,Ladies' Edinburgh Debating Society publication...,<link rel='stylesheet' type='text/css' href='h...,text/html,58668,,,,,,http://www.europeana.eu/schemas/edm/FullTextRe...,,,,,,


In [None]:
query= '(wr_dcterms_isReferencedBy:en*)'
df_search=search_api(query, 1)[0] #search_api requires n_rows; only the first result is used
data=apis.record(f'{df_search}')
df_0=pd.json_normalize(data,['object','aggregations'])
# df_1=pd.json_normalize(df_0,[)

In [4]:
data=apis.record('/09903/FFF7C96578DEEAA154DC865CFF18ADE808258C8F')
data

{'apikey': 'api2demo',
 'success': True,
 'statsDuration': 235,
 'requestNumber': 999,
 'object': {'about': '/09903/FFF7C96578DEEAA154DC865CFF18ADE808258C8F',
  'aggregations': [{'about': '/aggregation/provider/09903/FFF7C96578DEEAA154DC865CFF18ADE808258C8F',
    'edmDataProvider': {'def': ['http://data.europeana.eu/organization/1482250000004509131']},
    'edmIsShownAt': 'http://xml.memovs.ch/s023b-001.xml',
    'edmProvider': {'def': ['http://data.europeana.eu/organization/1482250000004509131']},
    'edmRights': {'def': ['http://rightsstatements.org/vocab/InC/1.0/']},
    'edmUgc': 'false',
    'aggregatedCHO': '/item/09903/FFF7C96578DEEAA154DC865CFF18ADE808258C8F',
    'webResources': [{'about': 'http://xml.memovs.ch/s023b-001.xml',
      'textAttributionSnippet': 'Fuite vers la Birmanie (2/21) - https://www.europeana.eu/item/09903/FFF7C96578DEEAA154DC865CFF18ADE808258C8F. Florey, Paul-André. Médiathèque Valais - Martigny - http://xml.memovs.ch/s023b-001.xml. In Copyright - http://

In [319]:
response = apis.search(
    query = 'proxy_dc_description.nl:*',
    rows = 5, 
    profile='standard'
    )

In [310]:
df=pd.json_normalize(response)
df                    

Unnamed: 0,apikey,success,requestNumber,itemsCount,totalResults,nextCursor,items,url,params.wskey,params.query,params.qf,params.reusability,params.media,params.thumbnail,params.landingpage,params.colourpalette,params.theme,params.sort,params.profile,params.rows,params.cursor,params.callback,params.facet
0,api2demo,True,999,5,1866111,AoEuLzkwNDAyL1NLX0NfODg=,"[{'completeness': 10, 'country': ['Netherlands...",https://api.europeana.eu/record/v2/search.json...,api2demo,proxy_dc_description.nl:*,,,,,,,,europeana_id,standard,5,*,,


In [322]:
df_items=pd.json_normalize(response, record_path=['items'])

In [332]:
pd.json_normalize(df_items.edmConceptLabel.loc[0])

Unnamed: 0,def
0,Malerei
1,चित्रकला
2,Malerkunst
3,Живопись
4,Жывапіс
5,Maalaustaide
6,Pintura
7,Живопис
8,Tapyba
9,Glezniecība


In [317]:
df_items_=pd.json_normalize(df_items)



# Conclusions

Here we saw how to combine the Search API and the Record API to retrieve field values that are not served by the Search API alone. The general idea is that the Record API serves all the fields of an item, while the Search API serves only a subset of them to optimize performance.