# Explanation

This Jupyter Notebook explains the functions in `check_duplicates.py` and `extract_data.py`. As a concrete use case, we assume that a person wants to donate some books to the library. We have provided the donor with an Excel template in which a series of columns is already defined, and the donor sends this table back filled in. An example can be found in `Fachreferats-Toolbox/data/input/Anfragen_Geschenk_Muster_Spendende_Beispiele.xlsx`.


To run this notebook, download the repository and open it in Jupyter. Your computer needs Python and several Python libraries installed (for example lxml).

If you have access to it, you can use the Jupyter Notebook service offered by the GWDG: https://jupyter-cloud.gwdg.de/welcome/
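
For a local installation, it can help to check first that the required libraries are importable. The following is a minimal sketch, assuming the packages imported in this notebook (`lxml`, `pandas`, `numpy`) plus `openpyxl`, which pandas typically needs to read `.xlsx` files. The helper name `missing_packages` is made up for this example and is not part of the toolbox.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the package names from `names` that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


# Assumed list of requirements; adjust it to the repository's own requirements.
required = ["lxml", "pandas", "numpy", "openpyxl"]
missing = missing_packages(required)
if missing:
    print("Please install:", ", ".join(missing))
else:
    print("All required packages are available.")
```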

# Import

## Libraries and Functions

In [1]:
import sys
import os
import glob
from lxml import etree
import pandas as pd
import numpy as np


In [2]:
sys.path.append(os.path.abspath("./../"))

In [3]:
from fachreferats_functions import check_duplicates, extract_data, clean_data



## Data

In [4]:
books_donated = pd.read_excel("./../../data/input/Spenden_Beispiele.xlsx").dropna(how='all', axis=1)

In [5]:
books_donated

Unnamed: 0,Titel,Vorname_Autor,Nachname_Autor,Erscheinungsjahr,ISBN,Erscheinungsort,Verlag
0,The picture of Dorian Gray,Oscar,Wilde,1994,978-014-06-2322-2,,
1,Gustav Klimt,,,2008,978-140-27-5920-8,,
2,La familia de Pascual Duarte,Camilo José,Cela,2009,978-842-33-3904-4,,
3,Auf der Eidechsburg,Ilse-Dore,Tanner,1938?,,Leipzig,A. H. Payne
4,Bayerisches Kochbuch,Maria,Hofmann,1950,,München,Birken
5,Centenaire de l‘Impressionisme,,,1974,,Paris,Musées nationaux
6,Les parents terribles,Jean,Cocteau,1938,,Paris,Gallimard
7,Straightforward Statistics,James D.,Evans,1996,,Pacific Grove,Brooks/Cole


In [6]:
books_donated = clean_data.clean_data(books_donated)

In [7]:
books_donated

Unnamed: 0,Titel,Vorname_Autor,Nachname_Autor,Erscheinungsjahr,ISBN,Erscheinungsort,Verlag
0,The picture of Dorian Gray,Oscar,Wilde,1994,9780140623222.0,,
1,Gustav Klimt,,,2008,9781402759208.0,,
2,La familia de Pascual Duarte,Camilo José,Cela,2009,9788423339044.0,,
3,Auf der Eidechsburg,Ilse-Dore,Tanner,1938?,,Leipzig,A. H. Payne
4,Bayerisches Kochbuch,Maria,Hofmann,1950,,München,Birken
5,Centenaire de l‘Impressionisme,,,1974,,Paris,Musées nationaux
6,Les parents terribles,Jean,Cocteau,1938,,Paris,Gallimard
7,Straightforward Statistics,James D.,Evans,1996,,Pacific Grove,Brooks/Cole


In [8]:
# If the ISBNs in the table are formatted as numbers in exponent notation, use:
# books_donated["ISBN"] = books_donated["ISBN"].fillna(0).astype('Int64').astype(str)

# If the ISBNs in the table are formatted as text, use:
# books_donated["ISBN"] = books_donated["ISBN"].astype(str)
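
After cleaning, the ISBNs in the table above are displayed as floating-point numbers (for example `9780140623222.0`). As a hedged alternative to the two commented lines, the sketch below turns hyphenated strings, floats, and missing values into plain strings. The function `normalize_isbn` is hypothetical; it is not part of `clean_data`, and what `clean_data` does internally is not shown here.

```python
import math


def normalize_isbn(value):
    """Return the ISBN as a string of digits (keeping the check character X), or "" if missing."""
    if value is None:
        return ""
    if isinstance(value, float):
        if math.isnan(value):
            return ""
        value = int(value)  # 9780140623222.0 -> 9780140623222
    text = str(value).strip().upper()
    # Drop hyphens, spaces and any other separators.
    return "".join(ch for ch in text if ch.isdigit() or ch == "X")


for raw in ["978-014-06-2322-2", 9780140623222.0, float("nan"), None]:
    print(repr(raw), "->", repr(normalize_isbn(raw)))
```

With pandas, such a function could be applied to the whole column with `books_donated["ISBN"].map(normalize_isbn)`.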


In [9]:
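# Replace missing author surnames with empty strings.
# Assigning the result back avoids the chained in-place fillna, which newer pandas versions warn about.
books_donated["Nachname_Autor"] = books_donated["Nachname_Autor"].fillna("")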

In [10]:
books_donated

Unnamed: 0,Titel,Vorname_Autor,Nachname_Autor,Erscheinungsjahr,ISBN,Erscheinungsort,Verlag
0,The picture of Dorian Gray,Oscar,Wilde,1994,9780140623222.0,,
1,Gustav Klimt,,,2008,9781402759208.0,,
2,La familia de Pascual Duarte,Camilo José,Cela,2009,9788423339044.0,,
3,Auf der Eidechsburg,Ilse-Dore,Tanner,1938?,,Leipzig,A. H. Payne
4,Bayerisches Kochbuch,Maria,Hofmann,1950,,München,Birken
5,Centenaire de l‘Impressionisme,,,1974,,Paris,Musées nationaux
6,Les parents terribles,Jean,Cocteau,1938,,Paris,Gallimard
7,Straightforward Statistics,James D.,Evans,1996,,Pacific Grove,Brooks/Cole


# Extract Data

In the following cells, we add several fields to the table using the ISBN when we can find it, or the title if not.

In [11]:
books_donated = extract_data.extract_fields_with_isbn( books_donated)

In [12]:
books_donated

Unnamed: 0,Titel,Vorname_Autor,Nachname_Autor,Erscheinungsjahr,ISBN,Erscheinungsort,Verlag,nach_ISBN_Titel,nach_ISBN_Sprache_Text,Titel_und_nach_ISBN_Titel_Ähnlichkeit_Score,nach_ISBN_Bestand_K10
0,The picture of Dorian Gray,Oscar,Wilde,1994,9780140623222.0,London,Penguin Books,The @picture of Dorian Gray,eng,0.98,3.0
1,Gustav Klimt,,,2008,9781402759208.0,New York [u.a.],Sterling,Gustav Klimt - Art Nouveau visionary,eng,0.5,2.0
2,La familia de Pascual Duarte,Camilo José,Cela,2009,9788423339044.0,Barcelona,Ed. Destino,La @familia de Pascual Duarte,spa,0.98,2.0
3,Auf der Eidechsburg,Ilse-Dore,Tanner,1938?,,Leipzig,A. H. Payne,,,,
4,Bayerisches Kochbuch,Maria,Hofmann,1950,,München,Birken,,,,
5,Centenaire de l‘Impressionisme,,,1974,,Paris,Musées nationaux,,,,
6,Les parents terribles,Jean,Cocteau,1938,,Paris,Gallimard,,,,
7,Straightforward Statistics,James D.,Evans,1996,,Pacific Grove,Brooks/Cole,,,,


The table now contains a column called `nach_ISBN_Titel`. It holds the title of the catalogue record found via the ISBN, so that we can verify that the donated book and the catalogue record from which the information was extracted are the same book.

In [13]:
books_donated = extract_data.extract_fields_with_title( books_donated)

In [14]:
books_donated

Unnamed: 0,Titel,Vorname_Autor,Nachname_Autor,Erscheinungsjahr,ISBN,Erscheinungsort,Verlag,nach_ISBN_Titel,nach_ISBN_Sprache_Text,Titel_und_nach_ISBN_Titel_Ähnlichkeit_Score,nach_ISBN_Bestand_K10,nach_Titel_Bestand_K10
0,The picture of Dorian Gray,Oscar,Wilde,1994,9780140623222.0,London,Penguin Books,The @picture of Dorian Gray,eng,0.98,3.0,175
1,Gustav Klimt,,,2008,9781402759208.0,New York [u.a.],Sterling,Gustav Klimt - Art Nouveau visionary,eng,0.5,2.0,208
2,La familia de Pascual Duarte,Camilo José,Cela,2009,9788423339044.0,Barcelona,Ed. Destino,La @familia de Pascual Duarte,spa,0.98,2.0,143
3,Auf der Eidechsburg,Ilse-Dore,Tanner,1938?,,Leipzig,A. H. Payne,,,,,1
4,Bayerisches Kochbuch,Maria,Hofmann,1950,,München,Birken,,,,,41
5,Centenaire de l‘Impressionisme,,,1974,,Paris,Musées nationaux,,,,,6
6,Les parents terribles,Jean,Cocteau,1938,,Paris,Gallimard,,,,,107
7,Straightforward Statistics,James D.,Evans,1996,,Pacific Grove,Brooks/Cole,,,,,32


In [15]:
books_donated = extract_data.extract_fields_with_title_author( books_donated, name_column_title = "Titel", name_column_author = "Nachname_Autor")

In [16]:
books_donated

Unnamed: 0,Titel,Vorname_Autor,Nachname_Autor,Erscheinungsjahr,ISBN,Erscheinungsort,Verlag,nach_ISBN_Titel,nach_ISBN_Sprache_Text,Titel_und_nach_ISBN_Titel_Ähnlichkeit_Score,nach_ISBN_Bestand_K10,nach_Titel_Bestand_K10,nach_Titel_Autor_Bestand_K10
0,The picture of Dorian Gray,Oscar,Wilde,1994,9780140623222.0,London,Penguin Books,The @picture of Dorian Gray,eng,0.98,3.0,175,220.0
1,Gustav Klimt,,,2008,9781402759208.0,New York [u.a.],Sterling,Gustav Klimt - Art Nouveau visionary,eng,0.5,2.0,208,
2,La familia de Pascual Duarte,Camilo José,Cela,2009,9788423339044.0,Barcelona,Ed. Destino,La @familia de Pascual Duarte,spa,0.98,2.0,143,163.0
3,Auf der Eidechsburg,Ilse-Dore,Tanner,1938?,,Leipzig,A. H. Payne,,,,,1,1.0
4,Bayerisches Kochbuch,Maria,Hofmann,1950,,München,Birken,,,,,41,22.0
5,Centenaire de l‘Impressionisme,,,1974,,Paris,Musées nationaux,,,,,6,
6,Les parents terribles,Jean,Cocteau,1938,,Paris,Gallimard,,,,,107,98.0
7,Straightforward Statistics,James D.,Evans,1996,,Pacific Grove,Brooks/Cole,,,,,32,


These functions have added the following fields to the donor's table:

- For those books that had an ISBN:
    - Place of publication
    - Publisher
    - Title in the catalogue (`nach_ISBN_Titel`)
    - Language of the text
    - A similarity score between the catalogue title and the title originally given by the donor (1 = perfect match; 0 = no match). For further documentation, see the section "ratio" in: https://docs.python.org/3/library/difflib.html.
    - Number of libraries in K10 with titles with this ISBN
- For all books:
    - Number of libraries in K10 with titles with this title
    - Number of libraries in K10 with titles with this title and this author
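
The similarity score described above can be tried out in isolation with `difflib.SequenceMatcher` from the standard library. This is a minimal sketch, not the toolbox's own code: the helper `title_similarity`, the rounding to two decimals, and the optional removal of the `@` that appears in the catalogue titles are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def title_similarity(donated_title, catalogue_title):
    """Similarity between 0 and 1 (difflib ratio), rounded to two decimals."""
    return round(SequenceMatcher(None, donated_title, catalogue_title).ratio(), 2)


pairs = [
    ("The picture of Dorian Gray", "The @picture of Dorian Gray"),
    ("Gustav Klimt", "Gustav Klimt - Art Nouveau visionary"),
]
for donated, catalogue in pairs:
    raw = title_similarity(donated, catalogue)
    # Assumption: strip the "@" found in the catalogue titles before comparing.
    without_marker = title_similarity(donated, catalogue.replace("@", ""))
    print(f"{donated}: {raw} (without '@': {without_marker})")
```

For these two pairs the unmodified comparison gives 0.98 and 0.5, the same values as in the column `Titel_und_nach_ISBN_Titel_Ähnlichkeit_Score`. A low score such as 0.5 does not necessarily mean a different book; here it only reflects the subtitle in the catalogue record.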



# Check copies with ISBN

Now we want to know whether these books are already held by the Göttingen Library. For that, we will use the ISBN, the title, and the combination of title and author.

In [17]:
books_donated

Unnamed: 0,Titel,Vorname_Autor,Nachname_Autor,Erscheinungsjahr,ISBN,Erscheinungsort,Verlag,nach_ISBN_Titel,nach_ISBN_Sprache_Text,Titel_und_nach_ISBN_Titel_Ähnlichkeit_Score,nach_ISBN_Bestand_K10,nach_Titel_Bestand_K10,nach_Titel_Autor_Bestand_K10
0,The picture of Dorian Gray,Oscar,Wilde,1994,9780140623222.0,London,Penguin Books,The @picture of Dorian Gray,eng,0.98,3.0,175,220.0
1,Gustav Klimt,,,2008,9781402759208.0,New York [u.a.],Sterling,Gustav Klimt - Art Nouveau visionary,eng,0.5,2.0,208,
2,La familia de Pascual Duarte,Camilo José,Cela,2009,9788423339044.0,Barcelona,Ed. Destino,La @familia de Pascual Duarte,spa,0.98,2.0,143,163.0
3,Auf der Eidechsburg,Ilse-Dore,Tanner,1938?,,Leipzig,A. H. Payne,,,,,1,1.0
4,Bayerisches Kochbuch,Maria,Hofmann,1950,,München,Birken,,,,,41,22.0
5,Centenaire de l‘Impressionisme,,,1974,,Paris,Musées nationaux,,,,,6,
6,Les parents terribles,Jean,Cocteau,1938,,Paris,Gallimard,,,,,107,98.0
7,Straightforward Statistics,James D.,Evans,1996,,Pacific Grove,Brooks/Cole,,,,,32,


In [18]:
books_donated = check_duplicates.check_duplicate_with_isbn( books_donated, 
    name_column_isbn = "ISBN",
    name_column_title = "Titel",
    verbose = True,
    )

http://sru.k10plus.de/opac-de-7!rec=1?version=1.1&query=pica.isb=9780140623222&operation=searchRetrieve&maximumRecords=10&recordSchema=picaxml
The picture of Dorian Gray 9780140623222 0
http://sru.k10plus.de/opac-de-7!rec=1?version=1.1&query=pica.isb=9781402759208&operation=searchRetrieve&maximumRecords=10&recordSchema=picaxml
Gustav Klimt 9781402759208 0
http://sru.k10plus.de/opac-de-7!rec=1?version=1.1&query=pica.isb=9788423339044&operation=searchRetrieve&maximumRecords=10&recordSchema=picaxml
La familia de Pascual Duarte 9788423339044 1
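
The verbose output above prints the SRU address that is queried for each ISBN. The sketch below rebuilds addresses of this form from a query string. The function `build_sru_url` is hypothetical, and the percent-encoding of spaces and quotation marks is an addition of this example (the printed URLs show them unencoded); the base address and parameters are copied from the output.

```python
from urllib.parse import quote

# Base address copied from the verbose output.
SRU_BASE = "http://sru.k10plus.de/opac-de-7!rec=1"


def build_sru_url(query, maximum_records=10):
    """Build a searchRetrieve URL for a query such as 'pica.isb=9780140623222'."""
    encoded = quote(query, safe="=")  # encode spaces and quotes, keep "="
    return (
        f"{SRU_BASE}?version=1.1&query={encoded}"
        f"&operation=searchRetrieve&maximumRecords={maximum_records}"
        "&recordSchema=picaxml"
    )


print(build_sru_url("pica.isb=9780140623222"))
print(build_sru_url('pica.tit="Gustav Klimt"'))
```

Opening such an address in a browser returns the raw XML records, which can be useful for checking an unexpected count by hand.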


In [19]:
books_donated

Unnamed: 0,Titel,Vorname_Autor,Nachname_Autor,Erscheinungsjahr,ISBN,Erscheinungsort,Verlag,nach_ISBN_Titel,nach_ISBN_Sprache_Text,Titel_und_nach_ISBN_Titel_Ähnlichkeit_Score,nach_ISBN_Bestand_K10,nach_Titel_Bestand_K10,nach_Titel_Autor_Bestand_K10,nach_ISBN_Bestand_Göttingen,nach_ISBN_Ort_Göttingen,nach_ISBN_Medium_Göttingen,nach_ISBN_URL_GUK
0,The picture of Dorian Gray,Oscar,Wilde,1994,9780140623222.0,London,Penguin Books,The @picture of Dorian Gray,eng,0.98,3.0,175,220.0,0.0,,,https://opac.sub.uni-goettingen.de/DB=1/SET=6/...
1,Gustav Klimt,,,2008,9781402759208.0,New York [u.a.],Sterling,Gustav Klimt - Art Nouveau visionary,eng,0.5,2.0,208,,0.0,,,https://opac.sub.uni-goettingen.de/DB=1/SET=6/...
2,La familia de Pascual Duarte,Camilo José,Cela,2009,9788423339044.0,Barcelona,Ed. Destino,La @familia de Pascual Duarte,spa,0.98,2.0,143,163.0,1.0,BBK-ROM,Aau,https://opac.sub.uni-goettingen.de/DB=1/SET=6/...
3,Auf der Eidechsburg,Ilse-Dore,Tanner,1938?,,Leipzig,A. H. Payne,,,,,1,1.0,,,,
4,Bayerisches Kochbuch,Maria,Hofmann,1950,,München,Birken,,,,,41,22.0,,,,
5,Centenaire de l‘Impressionisme,,,1974,,Paris,Musées nationaux,,,,,6,,,,,
6,Les parents terribles,Jean,Cocteau,1938,,Paris,Gallimard,,,,,107,98.0,,,,
7,Straightforward Statistics,James D.,Evans,1996,,Pacific Grove,Brooks/Cole,,,,,32,,,,,
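
An SRU 1.1 response reports the number of hits in a `numberOfRecords` element. The following sketch reads that element from a hand-written, heavily shortened stand-in for a server response. It uses the standard library parser to stay self-contained, although the notebook imports `lxml`. Whether the toolbox derives `nach_ISBN_Bestand_Göttingen` in exactly this way is not confirmed here; the function `number_of_records` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by SRU 1.1 responses.
SRW_NS = {"zs": "http://www.loc.gov/zing/srw/"}

# Hand-written sample, not an actual server response.
sample_response = """<?xml version="1.0" encoding="UTF-8"?>
<zs:searchRetrieveResponse xmlns:zs="http://www.loc.gov/zing/srw/">
  <zs:version>1.1</zs:version>
  <zs:numberOfRecords>1</zs:numberOfRecords>
</zs:searchRetrieveResponse>"""


def number_of_records(xml_text):
    """Return the hit count reported in an SRU response, or 0 if it is absent."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    node = root.find("zs:numberOfRecords", SRW_NS)
    return int(node.text) if node is not None and node.text else 0


print(number_of_records(sample_response))
```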


## Check copies with Title
Checking whether the texts are at the Göttingen Library, using the title.

In [24]:
books_donated = check_duplicates.check_duplicate_with_title( books_donated, 
    name_column_title = "Titel",
    verbose = True,
    )

The picture of Dorian Gray
http://sru.k10plus.de/opac-de-7!rec=1?version=1.1&query=pica.tit="The picture of Dorian Gray"&operation=searchRetrieve&maximumRecords=10&recordSchema=picaxml
['42']
['LS1', 'FMAG', 'LS1', '7/029', 'FMAG', 'FMAG', 'FMAG', '7/029', 'FMAG', 'FMAG']
Gustav Klimt
http://sru.k10plus.de/opac-de-7!rec=1?version=1.1&query=pica.tit="Gustav Klimt"&operation=searchRetrieve&maximumRecords=10&recordSchema=picaxml
['32']
['7/055', '7/055', 'LS1', 'BBW-KJL']
La familia de Pascual Duarte
http://sru.k10plus.de/opac-de-7!rec=1?version=1.1&query=pica.tit="La familia de Pascual Duarte"&operation=searchRetrieve&maximumRecords=10&recordSchema=picaxml
['8']
['BBK-ROM', 'BBK-ROM', 'BBK-ROM', 'BBK-ROM', 'BBK-ROM']
Auf der Eidechsburg
http://sru.k10plus.de/opac-de-7!rec=1?version=1.1&query=pica.tit="Auf der Eidechsburg"&operation=searchRetrieve&maximumRecords=10&recordSchema=picaxml
['0']
[]
Bayerisches Kochbuch
http://sru.k10plus.de/opac-de-7!rec=1?version=1.1&query=pica.tit="Bayerisc

In [25]:
books_donated

Unnamed: 0,Titel,Vorname_Autor,Nachname_Autor,Erscheinungsjahr,ISBN,Erscheinungsort,Verlag,nach_ISBN_Titel,nach_ISBN_Sprache_Text,Titel_und_nach_ISBN_Titel_Ähnlichkeit_Score,nach_ISBN_Bestand_K10,nach_Titel_Bestand_K10,nach_Titel_Autor_Bestand_K10,nach_ISBN_Bestand_Göttingen,nach_ISBN_Ort_Göttingen,nach_ISBN_Medium_Göttingen,nach_ISBN_URL_GUK,nach_Titel_Bestand_Göttingen,nach_Titel_Ort_Göttingen,error_nach_title
0,The picture of Dorian Gray,Oscar,Wilde,1994,9780140623222.0,London,Penguin Books,The @picture of Dorian Gray,eng,0.98,3.0,175,220.0,0.0,,,https://opac.sub.uni-goettingen.de/DB=1/SET=6/...,42,FMAG|7/029|LS1,1.0
1,Gustav Klimt,,,2008,9781402759208.0,New York [u.a.],Sterling,Gustav Klimt - Art Nouveau visionary,eng,0.5,2.0,208,,0.0,,,https://opac.sub.uni-goettingen.de/DB=1/SET=6/...,32,BBW-KJL|7/055|LS1,1.0
2,La familia de Pascual Duarte,Camilo José,Cela,2009,9788423339044.0,Barcelona,Ed. Destino,La @familia de Pascual Duarte,spa,0.98,2.0,143,163.0,1.0,BBK-ROM,Aau,https://opac.sub.uni-goettingen.de/DB=1/SET=6/...,8,BBK-ROM,1.0
3,Auf der Eidechsburg,Ilse-Dore,Tanner,1938?,,Leipzig,A. H. Payne,,,,,1,1.0,,,,,0,,1.0
4,Bayerisches Kochbuch,Maria,Hofmann,1950,,München,Birken,,,,,41,22.0,,,,,1,HG-MAG,1.0
5,Centenaire de l‘Impressionisme,,,1974,,Paris,Musées nationaux,,,,,6,,,,,,0,,1.0
6,Les parents terribles,Jean,Cocteau,1938,,Paris,Gallimard,,,,,107,98.0,,,,,4,BBK-ROM|LS1,1.0
7,Straightforward Statistics,James D.,Evans,1996,,Pacific Grove,Brooks/Cole,,,,,32,,,,,,2,,1.0
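
The verbose output lists one location code per copy, while the column `nach_Titel_Ort_Göttingen` contains each code only once, separated by `|`. A minimal sketch of such a reduction is shown below. The helper `join_unique` is an assumption of this example; it keeps the codes in first-seen order, so the order can differ from the one in the table.

```python
def join_unique(values, separator="|"):
    """Join the distinct, non-empty values in first-seen order."""
    return separator.join(dict.fromkeys(v for v in values if v))


# Location codes copied from the verbose output for "The picture of Dorian Gray".
locations = ['LS1', 'FMAG', 'LS1', '7/029', 'FMAG', 'FMAG', 'FMAG', '7/029', 'FMAG', 'FMAG']
print(join_unique(locations))
```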


## Check copies with Title and Author
Checking whether the texts are at the Göttingen Library, using the title together with the author's surname.

In [26]:
books_donated = check_duplicates.check_duplicate_with_title_author( books_donated, name_column_title = "Titel", name_column_author = "Nachname_Autor", verbose = True)

The picture of Dorian Gray Wilde
http://sru.k10plus.de/opac-de-7!rec=1?version=1.1&query=pica.tit="The picture of Dorian Gray" and pica.per=Wilde&operation=searchRetrieve&maximumRecords=10&recordSchema=picaxml
['29']
['LS1', 'LS1', '7/029', 'FMAG', 'FMAG', '7/029', 'FMAG', 'FMAG', 'FMAG', 'LS1', '7/029', 'FMAG']
Gustav Klimt 
http://sru.k10plus.de/opac-de-7!rec=1?version=1.1&query=pica.tit="Gustav Klimt" and pica.per=&operation=searchRetrieve&maximumRecords=10&recordSchema=picaxml
[]
[]
La familia de Pascual Duarte Cela
http://sru.k10plus.de/opac-de-7!rec=1?version=1.1&query=pica.tit="La familia de Pascual Duarte" and pica.per=Cela&operation=searchRetrieve&maximumRecords=10&recordSchema=picaxml
['5']
['BBK-ROM', 'BBK-ROM', 'BBK-ROM', 'BBK-ROM']
Auf der Eidechsburg Tanner
http://sru.k10plus.de/opac-de-7!rec=1?version=1.1&query=pica.tit="Auf der Eidechsburg" and pica.per=Tanner&operation=searchRetrieve&maximumRecords=10&recordSchema=picaxml
['0']
[]
Bayerisches Kochbuch Hofmann
http://sr

In [27]:
books_donated

Unnamed: 0,Titel,Vorname_Autor,Nachname_Autor,Erscheinungsjahr,ISBN,Erscheinungsort,Verlag,nach_ISBN_Titel,nach_ISBN_Sprache_Text,Titel_und_nach_ISBN_Titel_Ähnlichkeit_Score,...,nach_ISBN_Ort_Göttingen,nach_ISBN_Medium_Göttingen,nach_ISBN_URL_GUK,nach_Titel_Bestand_Göttingen,nach_Titel_Ort_Göttingen,error_nach_title,nach_Titel_Nachname_Autor_Bestand_Göttingen,nach_Titel_Nachname_Autor_Ort_Göttingen,nach_Titel_Nachname_Autor_Medium_Göttingen,nach_Titel_Nachname_Autor_URL_GUK
0,The picture of Dorian Gray,Oscar,Wilde,1994,9780140623222.0,London,Penguin Books,The @picture of Dorian Gray,eng,0.98,...,,,https://opac.sub.uni-goettingen.de/DB=1/SET=6/...,42,FMAG|7/029|LS1,1.0,29.0,FMAG|7/029|LS1,Aaukr|Aau|Aaukf,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
1,Gustav Klimt,,,2008,9781402759208.0,New York [u.a.],Sterling,Gustav Klimt - Art Nouveau visionary,eng,0.5,...,,,https://opac.sub.uni-goettingen.de/DB=1/SET=6/...,32,BBW-KJL|7/055|LS1,1.0,,,,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
2,La familia de Pascual Duarte,Camilo José,Cela,2009,9788423339044.0,Barcelona,Ed. Destino,La @familia de Pascual Duarte,spa,0.98,...,BBK-ROM,Aau,https://opac.sub.uni-goettingen.de/DB=1/SET=6/...,8,BBK-ROM,1.0,5.0,BBK-ROM,Aau|Aan|Aar,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
3,Auf der Eidechsburg,Ilse-Dore,Tanner,1938?,,Leipzig,A. H. Payne,,,,...,,,,0,,1.0,0.0,,,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
4,Bayerisches Kochbuch,Maria,Hofmann,1950,,München,Birken,,,,...,,,,1,HG-MAG,1.0,0.0,,,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
5,Centenaire de l‘Impressionisme,,,1974,,Paris,Musées nationaux,,,,...,,,,0,,1.0,,,,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
6,Les parents terribles,Jean,Cocteau,1938,,Paris,Gallimard,,,,...,,,,4,BBK-ROM|LS1,1.0,4.0,BBK-ROM|LS1,Aau|AFr|Afn,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
7,Straightforward Statistics,James D.,Evans,1996,,Pacific Grove,Brooks/Cole,,,,...,,,,2,,1.0,0.0,,,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...


# Results

In [28]:
books_donated

Unnamed: 0,Titel,Vorname_Autor,Nachname_Autor,Erscheinungsjahr,ISBN,Erscheinungsort,Verlag,nach_ISBN_Titel,nach_ISBN_Sprache_Text,Titel_und_nach_ISBN_Titel_Ähnlichkeit_Score,...,nach_ISBN_Ort_Göttingen,nach_ISBN_Medium_Göttingen,nach_ISBN_URL_GUK,nach_Titel_Bestand_Göttingen,nach_Titel_Ort_Göttingen,error_nach_title,nach_Titel_Nachname_Autor_Bestand_Göttingen,nach_Titel_Nachname_Autor_Ort_Göttingen,nach_Titel_Nachname_Autor_Medium_Göttingen,nach_Titel_Nachname_Autor_URL_GUK
0,The picture of Dorian Gray,Oscar,Wilde,1994,9780140623222.0,London,Penguin Books,The @picture of Dorian Gray,eng,0.98,...,,,https://opac.sub.uni-goettingen.de/DB=1/SET=6/...,42,FMAG|7/029|LS1,1.0,29.0,FMAG|7/029|LS1,Aaukr|Aau|Aaukf,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
1,Gustav Klimt,,,2008,9781402759208.0,New York [u.a.],Sterling,Gustav Klimt - Art Nouveau visionary,eng,0.5,...,,,https://opac.sub.uni-goettingen.de/DB=1/SET=6/...,32,BBW-KJL|7/055|LS1,1.0,,,,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
2,La familia de Pascual Duarte,Camilo José,Cela,2009,9788423339044.0,Barcelona,Ed. Destino,La @familia de Pascual Duarte,spa,0.98,...,BBK-ROM,Aau,https://opac.sub.uni-goettingen.de/DB=1/SET=6/...,8,BBK-ROM,1.0,5.0,BBK-ROM,Aau|Aan|Aar,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
3,Auf der Eidechsburg,Ilse-Dore,Tanner,1938?,,Leipzig,A. H. Payne,,,,...,,,,0,,1.0,0.0,,,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
4,Bayerisches Kochbuch,Maria,Hofmann,1950,,München,Birken,,,,...,,,,1,HG-MAG,1.0,0.0,,,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
5,Centenaire de l‘Impressionisme,,,1974,,Paris,Musées nationaux,,,,...,,,,0,,1.0,,,,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
6,Les parents terribles,Jean,Cocteau,1938,,Paris,Gallimard,,,,...,,,,4,BBK-ROM|LS1,1.0,4.0,BBK-ROM|LS1,Aau|AFr|Afn,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...
7,Straightforward Statistics,James D.,Evans,1996,,Pacific Grove,Brooks/Cole,,,,...,,,,2,,1.0,0.0,,,https://opac.sub.uni-goettingen.de/DB=1/SET=2/...


The column `nach_ISBN_Bestand_Göttingen` shows that, of the three books for which the donor has given an ISBN, the SUB already has one title (_La familia de Pascual Duarte_) with exactly this ISBN.

The column `nach_Titel_Bestand_Göttingen` shows that titles matching six of the eight proposed donations are already in our catalogue.

However, the column `nach_Titel_Nachname_Autor_Bestand_Göttingen` shows that two of these books (_Bayerisches Kochbuch_ and _Straightforward Statistics_) are NOT in the Göttingen catalogue when the name of the author is added to the search. This probably means that the library has books with this title, but by different authors.

In any case, the librarian also gets several columns with direct links to the catalogue with specific queries, and can use them to decide whether the library accepts these books.
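
The reading of the three Göttingen columns described above can be condensed into a small decision rule. The sketch below is only an illustration: the function `holding_status`, its labels, and the order of the checks are assumptions, not part of the toolbox. The counts are copied from the results table.

```python
import math


def holding_status(by_isbn, by_title, by_title_author):
    """Classify one donation from its three holding counts (None or NaN = no value)."""

    def found(count):
        return count is not None and not math.isnan(count) and count > 0

    if found(by_isbn):
        return "same ISBN already held"
    if found(by_title_author):
        return "title and author already held"
    if found(by_title):
        return "title held, author not matched"
    return "not found"


# (title, count by ISBN, count by title, count by title and author)
rows = [
    ("La familia de Pascual Duarte", 1.0, 8, 5.0),
    ("The picture of Dorian Gray", 0.0, 42, 29.0),
    ("Bayerisches Kochbuch", None, 1, 0.0),
    ("Auf der Eidechsburg", None, 0, 0.0),
]
for title, by_isbn, by_title, by_title_author in rows:
    print(f"{title}: {holding_status(by_isbn, by_title, by_title_author)}")
```

Such a label is only a first sorting aid; an edition held under the same title and author may still differ from the donated copy.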

The results can be exported as a table (Excel or Tab-Separated Values).

In [29]:
books_donated.to_excel("./../../data/output/Spenden_Beispiele.xlsx")

In [30]:
books_donated.to_csv("./../../data/output/Spenden_Beispiele.tsv", sep="\t")

In [31]:
books_donated.columns.tolist()

['Titel',
 'Vorname_Autor',
 'Nachname_Autor',
 'Erscheinungsjahr',
 'ISBN',
 'Erscheinungsort',
 'Verlag',
 'nach_ISBN_Titel',
 'nach_ISBN_Sprache_Text',
 'Titel_und_nach_ISBN_Titel_Ähnlichkeit_Score',
 'nach_ISBN_Bestand_K10',
 'nach_Titel_Bestand_K10',
 'nach_Titel_Autor_Bestand_K10',
 'nach_ISBN_Bestand_Göttingen',
 'nach_ISBN_Ort_Göttingen',
 'nach_ISBN_Medium_Göttingen',
 'nach_ISBN_URL_GUK',
 'nach_Titel_Bestand_Göttingen',
 'nach_Titel_Ort_Göttingen',
 'error_nach_title',
 'nach_Titel_Nachname_Autor_Bestand_Göttingen',
 'nach_Titel_Nachname_Autor_Ort_Göttingen',
 'nach_Titel_Nachname_Autor_Medium_Göttingen',
 'nach_Titel_Nachname_Autor_URL_GUK']