# Example of sourcing data by combining the Search and Record APIs

This notebook uses the Python wrapper developed by the RnD team to build a series of calls to the Search and Record APIs.
The two APIs are combined because the Search API alone does not always return everything: some fields are not included in its results, while the Record API returns all the fields of a given object.
For example, not all multilingual fields are served by the Search API.
To retrieve data for multilingual fields that the Search API does not serve, one option is to use the Search API to find the matching items and then the Record API to retrieve all fields for each specific item.

You can find more details about the Search and Record APIs at the following links:

*   https://pro.europeana.eu/page/search
*   https://pro.europeana.eu/page/record

Both APIs serve data using the Europeana Data Model: https://pro.europeana.eu/page/intro#edm


In [143]:
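# Import libraries
import os
import pandas as pd
import pyeuropeana.apis as apis
import pyeuropeana.utils as utils
# Display up to 400 rows and columns so that wide DataFrames are not truncated
pd.set_option("display.max_rows", 400)
pd.set_option("display.max_columns", 400)
# Silence the pandas chained-assignment warning
pd.options.mode.chained_assignment = None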

# Combining the Search and Record APIs to retrieve field values

In this example we look for the values of the field proxy_dc_type.it.
proxy_dc_type is indexed in Solr, but its values are not returned by the Search API.
Therefore, to retrieve the Italian value of this field for a particular record, we combine the Search and Record APIs.
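
The query used further down in this notebook asks for records that have a value for proxy_dc_type in both English and Italian, written as `field.language:*` clauses joined with AND. As a minimal sketch, such a query string could be assembled as follows. The helper name `build_language_query` is hypothetical and is not part of pyeuropeana; it only builds the string and makes no API call.

```python
def build_language_query(field, languages):
    """Build a query matching records that have a value of `field`
    in every one of the given languages.

    Hypothetical helper for illustration; not part of pyeuropeana.
    """
    # One wildcard clause per language, e.g. "proxy_dc_type.it:*"
    clauses = [f"{field}.{lang}:*" for lang in languages]
    return "(" + " AND ".join(clauses) + ")"


query = build_language_query("proxy_dc_type", ["en", "it"])
print(query)  # (proxy_dc_type.en:* AND proxy_dc_type.it:*)
```

The resulting string is the same one that is passed to the Search API in the example query of this notebook.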

In [14]:
# Set the environment variable that holds the API key
os.environ['EUROPEANA_API_KEY'] = 'api2demo'

In [22]:
# Return the europeana_id values of the records that match a query
def search_api(query):
    response = apis.search(
        query=query,
        rows=5,
        profile='rich',
    )
    df_search = utils.search2df(response).europeana_id
    return df_search

In [144]:
# Fetch the full record for each id found by search_api and keep its provider proxy
def record_api(items_id):
    df_list = []
    for item in items_id:
        data = apis.record(f'{item}')
        df_0 = pd.json_normalize(data, ['object', 'proxies'])
        # Select the provider proxy (second row), which holds the fields of interest
        df_proxy_provider = df_0.iloc[1]
        df_proxy_provider = pd.DataFrame(df_proxy_provider)
        df_proxy_provider = df_proxy_provider.transpose()
        df_list.append(df_proxy_provider)
    df_proxy_tot = pd.concat(df_list, ignore_index=True, axis=0)
    return df_proxy_tot
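
The function above picks the provider proxy by position (the second entry in the list of proxies). A possible alternative, sketched here in plain Python, is to select it through the europeanaProxy flag that each proxy carries, so that the result does not depend on the order of the list. The record below is a hand-written stand-in shaped like the part of the Record API response used in this notebook: the `/proxy/europeana/...` identifier is an assumption made for illustration, and `provider_proxies` is a hypothetical helper, not part of pyeuropeana.

```python
def provider_proxies(record):
    """Return the proxies of a record that are not the Europeana proxy.

    Hypothetical helper for illustration; not part of pyeuropeana.
    """
    proxies = record.get("object", {}).get("proxies", [])
    # Provider proxies are the ones whose europeanaProxy flag is False
    return [p for p in proxies if not p.get("europeanaProxy", False)]


# Hand-written stand-in for a Record API response (values are illustrative)
sample_record = {
    "object": {
        "proxies": [
            {
                "about": "/proxy/europeana/40/CNMD0000249225",  # assumed identifier
                "europeanaProxy": True,
            },
            {
                "about": "/proxy/provider/40/CNMD0000249225",
                "europeanaProxy": False,
                "dcType": {
                    "en": ["Manuscript", "Monograph"],
                    "it": ["Manoscritto", "Monografia"],
                },
            },
        ]
    }
}

selected = provider_proxies(sample_record)
print(selected[0]["about"])         # /proxy/provider/40/CNMD0000249225
print(selected[0]["dcType"]["it"])  # ['Manoscritto', 'Monografia']
```

The nested dcType dictionary in the stand-in mirrors how pd.json_normalize ends up producing columns such as dcType.en and dcType.it.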

In [142]:
# Example: run a query, collect the matching ids, then use the Record API to extract
# the provider proxy data and place it in a DataFrame
query = '(proxy_dc_type.en:* AND proxy_dc_type.it:*)'  # dc_type is not served in Search API results
df_search = search_api(query)
df_proxy_tot = record_api(df_search)
df_proxy_tot.head(2)

Unnamed: 0,about,proxyIn,proxyFor,edmType,europeanaProxy,dcIdentifier.def,dcLanguage.def,dcSubject.def,dcType.def,dcContributor.def,dcCreator.def,dcDate.def,dcDescription.it,dcFormat.def,dcRights.def,dcTitle.it,dcType.en,dcType.it,dctermsAlternative.def,dctermsAlternative.it,dctermsIsPartOf.def,dctermsProvenance.def,dcFormat.it,dcRights.it,dctermsExtent.def,dctermsIsPartOf.en,dctermsIsReferencedBy.it,dctermsSpatial.def,edmCurrentLocation.it,dcDate.it,dcDescription.en,dcTitle.en
0,/proxy/provider/40/CNMD0000249225,[/aggregation/provider/40/CNMD0000249225],/item/40/CNMD0000249225,TEXT,False,,[lat],"[Uccelli, Birds, Animali, Animals, Ornitologia...",,"[Cartari, Carlo, Della Rovere, Francesco Maria...","[Djunkovskoy, Stefano]",[1601-1631],[1601-1631 data stimata; Catalogo anonimo e an...,[cartaceo ; cc. II + 302 + II],[Biblioteca universitaria Alessandrina - Roma],"[Roma, Biblioteca universitaria Alessandrina, ...","[Manuscript, Monograph]","[Manoscritto, Monografia]","[Continet hic picturas diligentissimas avium, ...",[Ornitologico],[Fondo Manoscritti della Biblioteca universita...,[Convento del Santissimo Crocifisso dei chieri...,,,,,,,,,,
1,/proxy/provider/288/work_70019,[/aggregation/provider/288/work_70019],/item/288/work_70019,IMAGE,False,[work_70019],[zxx],"[http://vocab.getty.edu/aat/300379558, http://...","[http://vocab.getty.edu/aat/300046001, http://...",,,,,,,[DISTANZIATORE DI COLLANA],"[necklace, DISTANZIATORE DI COLLANA]",[opere],,,,,[BRONZO],[MUSEO NAZIONALE PREISTORICO ETNOGRAFICO L. PI...,"[altezza: cm 32, diametro: cm 0.7]",[Europeana Archaeology],[Scheda ICCD RA: 12-00870752],[http://sws.geonames.org/8015122],[MUSEO NAZIONALE PREISTORICO ETNOGRAFICO L. PI...,,,


# Conclusions

Here we saw how to combine the Search API and the Record API to retrieve field values that are not served by the Search API alone. The general idea is that the Record API serves all the fields of a record, while the Search API returns only a subset of them to optimize performance.