<a href="https://colab.research.google.com/github/PE-KR/PE-KR.github.io/blob/master/5_4_Knowledge_Graph_Programming_Example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Scientific Birthdays Example**

This is the Python notebook example for lecture 5.4, Knowledge Graph Programming, of the openHPI course "Knowledge Graphs 2020".

*Please make a copy of this notebook to try out your own adaptations via "File -> Save Copy in Drive".*

First, we have to install the **sparqlwrapper library** before we can use it with the notebook.

In [None]:
!pip install -q sparqlwrapper    #install SPARQLwrapper


We are going to use a few libraries:



*   **datetime** for date formatting and interpretation
*   **SPARQLWrapper** to execute SPARQL queries and to import the results into Python

Thus, we will import them now.



In [None]:
from datetime import datetime
from SPARQLWrapper import SPARQLWrapper, JSON, XML, N3, RDF

We will use DBpedia (http://dbpedia.org/sparql) as our SPARQL endpoint.

In [None]:
sparql = SPARQLWrapper("http://dbpedia.org/sparql") #determine SPARQL endpoint

Next comes the query example from the lecture, followed by its execution.

In [None]:
#SPARQL query to be executed
sparql.setQuery("""
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc:  <http://purl.org/dc/elements/1.1/>

Select distinct ?birthdate ?thumbnail ?scientist ?name ?description  WHERE {
?scientist rdf:type dbo:Scientist ;
        dbo:birthDate ?birthdate ;
        rdfs:label ?name ;
        rdfs:comment ?description 
 FILTER ((lang(?name)="en")&&(lang(?description)="en")&&(STRLEN(STR(?birthdate))>6)&&(SUBSTR(STR(?birthdate),6)=SUBSTR(STR(bif:curdate('')),6))) .
 OPTIONAL { ?scientist dbo:thumbnail ?thumbnail . }
} ORDER BY ?birthdate
""")

sparql.setReturnFormat(JSON)   # Return format is JSON
results = sparql.query().convert()   # execute SPARQL query and write result to "results"
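
The `FILTER` in the query keeps only scientists whose birth date shares today's month and day. SPARQL's `SUBSTR` counts positions from 1, so `SUBSTR(STR(?birthdate),6)` is the `MM-DD` tail of a `YYYY-MM-DD` string. As a rough illustration, the same comparison could be written in Python as follows. The function name `same_month_day` is made up for this sketch and is not part of the lecture code.

```python
from datetime import date


def same_month_day(birthdate: str, today: date) -> bool:
    """Mirror the query's FILTER: compare the MM-DD tail of two ISO dates."""
    # STRLEN(STR(?birthdate)) > 6 rules out values too short to hold a full date
    if len(birthdate) <= 6:
        return False
    # SPARQL SUBSTR(s, 6) is 1-based, so it corresponds to s[5:] in Python
    return birthdate[5:] == today.isoformat()[5:]


print(same_month_day("1700-11-28", date(2020, 11, 28)))
print(same_month_day("1879-03-14", date(2020, 11, 28)))
```

Doing the comparison inside the query, as the notebook does, means the endpoint returns only the matching rows instead of every scientist with a birth date.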

The results are now formatted as HTML so that they can be displayed nicely in a browser.

In [None]:
# Create HTML output
print('<html><head><title>Scientific Birthdays of Today</title></head>')

# Format today's date as weekday (%A), month (%B) and day of the month (%d)
date = datetime.today().strftime("%A  %B %d")
print('<body><h1>Scientific Birthdays of {}</h1>'.format(date))

print('<ul>')

for result in results["results"]["bindings"]:
    if "scientist" in result:
        # Create a Wikipedia link from the last path segment of the DBpedia URI
        wikiurl = "http://en.wikipedia.org/wiki/" + result["scientist"]["value"].split('/')[-1]
    else:
        wikiurl = 'NONE'
    if "name" in result:
        name = result["name"]["value"]
    else:
        name = 'NONE'
    if "birthdate" in result:
        birthdate = result["birthdate"]["value"]
    else:
        birthdate = 'NONE'
    if "description" in result:
        description = result["description"]["value"]
    else:
        description = ' '
    if "thumbnail" in result:
        pic = result["thumbnail"]["value"]
    else:
        # Fall back to a question-mark placeholder image
        pic = 'https://upload.wikimedia.org/wikipedia/commons/thumb/b/b0/Question_mark2.svg/71px-Question_mark2.svg.png'

    # Parse the date as datetime; skip entries whose birth date is missing or not YYYY-MM-DD
    try:
        dt = datetime.strptime(birthdate, '%Y-%m-%d')
    except ValueError:
        continue

    print('<li><b>{}</b> -- <img src="{}" height="60px"> <a href="{}">{}</a>, {} </li>'.format(dt.year, pic.replace("300", "60"), wikiurl, name, description))

print('</ul>')
print('</body></html>')

<html><head><title>Scientific Birthdays of Today</title></head>
<body><h1>Scientific Birthdays of Saturday  November 28</h1>
<ul>
<li><b>1700</b> -- <img src="http://commons.wikimedia.org/wiki/Special:FilePath/The_Reverend_Nathaniel_Bliss.jpg?width=60" height="60px"> <a href="http://en.wikipedia.org/wiki/Nathaniel_Bliss">Nathaniel Bliss</a>, The Reverend Nathaniel Bliss (28 November 1700 – 2 September 1764) was an English astronomer of the 18th century, serving as Britain's fourth Astronomer Royal between 1762 and 1764. </li>
<li><b>1772</b> -- <img src="http://commons.wikimedia.org/wiki/Special:FilePath/Luke_Howard.jpg?width=60" height="60px"> <a href="http://en.wikipedia.org/wiki/Luke_Howard">Luke Howard</a>, Luke Howard, FRS (28 November 1772 – 21 March 1864) was a British manufacturing chemist and an amateur meteorologist with broad interests in science. His lasting contribution to science is a nomenclature system for clouds, which he proposed in an 1802 presentation to the Askesia
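
Each entry of `results["results"]["bindings"]` is a dictionary that maps a variable name to an object with a `value` key, and an optional variable such as `?thumbnail` is simply absent when it is unbound. The sketch below shows one possible way to condense the `if`/`else` chain with `dict.get` and to escape the text with `html.escape` before it is placed into the page. The helper names `value_of` and `render_item` and the hand-written sample binding are assumptions for illustration, not part of the lecture code.

```python
from datetime import datetime
from html import escape

PLACEHOLDER = ("https://upload.wikimedia.org/wikipedia/commons/thumb/b/b0/"
               "Question_mark2.svg/71px-Question_mark2.svg.png")


def value_of(binding, key, default):
    """Return the bound value for key, or default if the variable is unbound."""
    return binding.get(key, {}).get("value", default)


def render_item(binding):
    """Build one <li> element from a single SPARQL JSON binding."""
    uri = value_of(binding, "scientist", "")
    wikiurl = "http://en.wikipedia.org/wiki/" + uri.split("/")[-1]
    name = value_of(binding, "name", "NONE")
    description = value_of(binding, "description", " ")
    pic = value_of(binding, "thumbnail", PLACEHOLDER)
    # Raises ValueError if the birth date is missing or not YYYY-MM-DD
    year = datetime.strptime(value_of(binding, "birthdate", ""), "%Y-%m-%d").year
    template = ('<li><b>{}</b> -- <img src="{}" height="60px"> '
                '<a href="{}">{}</a>, {} </li>')
    return template.format(year, escape(pic), escape(wikiurl),
                           escape(name), escape(description))


# Hand-written sample in the shape of one binding (not real query output)
sample = {
    "scientist": {"type": "uri", "value": "http://dbpedia.org/resource/Luke_Howard"},
    "name": {"type": "literal", "xml:lang": "en", "value": "Luke Howard"},
    "birthdate": {"type": "typed-literal", "value": "1772-11-28"},
    "description": {"type": "literal", "xml:lang": "en",
                    "value": "Chemist & meteorologist"},
}
item = render_item(sample)
print(item)
```

Escaping matters because labels and abstracts can contain characters such as `&` or `<` that would otherwise be read as markup.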

Now we do exactly the same, but write the output into a file and download it to your local computer, where it can be displayed in your browser.

In [None]:
from google.colab import files

with open('birthday.html', 'w') as f:
	# Create HTML output
	f.write('<html><head><title>Scientific Birthdays of Today</title></head>')

	#extract Weekday %A / Month %B / Day of the Month %d by formatting today's date accordingly
	date = datetime.today().strftime("%A  %B %d")
	f.write('<body><h1>Scientific Birthdays of {}</h1>'.format(date))

	f.write('<ul>')

	for result in results["results"]["bindings"]:
		if ("scientist" in result):
			#Create a Wikipedia Link
			wikiurl = "http://en.wikipedia.org/wiki/" + result["scientist"]["value"].split('/')[-1]
		else:
			wikiurl = 'NONE'  
		if ("name" in result):
			name = result["name"]["value"]
		else:
			name = 'NONE'  		
		if ("birthdate" in result):
			birthdate = result["birthdate"]["value"]
		else:
			birthdate = 'NONE'        
		if ("description" in result):
			description = result["description"]["value"]
		else:
			description = ' '  
		if ("thumbnail" in result):
			pic = result["thumbnail"]["value"]
		else:
			pic = 'https://upload.wikimedia.org/wikipedia/commons/thumb/b/b0/Question_mark2.svg/71px-Question_mark2.svg.png'        	

		# Parse the date as datetime; skip entries whose birth date is missing or not YYYY-MM-DD
		try:
			dt = datetime.strptime(birthdate, '%Y-%m-%d')
		except ValueError:
			continue
  
		f.write('<li><b>{}</b> <ul><li><img src="{}" height="60px"> <a href="{}">{}</a>, {} </li></ul></li>'.format(dt.year, pic.replace("300", "60"), wikiurl, name, description))
	f.write('</ul>')
	f.write('</body></html>')
files.download('birthday.html')

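
The download step relies on `google.colab`, which can only be imported inside Colab, and `open('birthday.html', 'w')` uses the platform's default text encoding. The following is a minimal sketch of a more portable variant, assuming UTF-8 output is wanted for names such as "Étienne Laspeyres". The function name `save_page` and the file name `birthday_demo.html` are hypothetical.

```python
import os
import tempfile


def save_page(path, body_html):
    """Write a small UTF-8 HTML page and, inside Colab only, offer it for download."""
    page = ('<html><head><meta charset="utf-8">'
            '<title>Scientific Birthdays of Today</title></head>'
            '<body>{}</body></html>').format(body_html)
    # An explicit encoding keeps non-ASCII names intact on every platform
    with open(path, "w", encoding="utf-8") as f:
        f.write(page)
    try:
        from google.colab import files  # only available inside Colab
    except ImportError:
        return path  # outside Colab: just leave the file on disk
    files.download(path)
    return path


demo_path = os.path.join(tempfile.gettempdir(), "birthday_demo.html")
save_page(demo_path, "<ul><li>Étienne Laspeyres</li></ul>")
with open(demo_path, encoding="utf-8") as f:
    content = f.read()
print("Étienne Laspeyres" in content)
```

The `<meta charset="utf-8">` tag tells the browser which encoding to expect when the file is opened locally.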

It probably looks a bit nicer in a table...

In [None]:
from google.colab import files

with open('birthday.html', 'w') as f:
	# Create HTML output
	f.write('<html><head><title>Scientific Birthdays of Today</title></head>')

	# Open the body so that the closing </body> tag below has a match; the table version omits the heading
	f.write('<body>')

	f.write('<table style="width:75%">')

	for result in results["results"]["bindings"]:
		if ("scientist" in result):
			#Create a Wikipedia Link
			wikiurl = "http://en.wikipedia.org/wiki/" + result["scientist"]["value"].split('/')[-1]
		else:
			wikiurl = 'NONE'  
		if ("name" in result):
			name = result["name"]["value"]
		else:
			name = 'NONE'  		
		if ("birthdate" in result):
			birthdate = result["birthdate"]["value"]
		else:
			birthdate = 'NONE'        
		if ("description" in result):
			description = result["description"]["value"]
		else:
			description = ' '  
		if ("thumbnail" in result):
			pic = result["thumbnail"]["value"]
		else:
			pic = 'https://upload.wikimedia.org/wikipedia/commons/thumb/b/b0/Question_mark2.svg/71px-Question_mark2.svg.png'   

		# Parse the date as datetime; skip entries whose birth date is missing or not YYYY-MM-DD
		try:
			dt = datetime.strptime(birthdate, '%Y-%m-%d')
		except ValueError:
			continue
  
		f.write('<tr><td><b>{}</b></td> <td style="text-align: center;"><img src="{}" height="60px"></td><td style="text-align: justify;"><a href="{}">{}</a>, {} </td></tr>'.format(dt.year, pic.replace("300", "60"), wikiurl, name, description))
	f.write('</table>')
	f.write('</body></html>')
files.download('birthday.html')

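
The cells above shrink the thumbnail with a plain string replacement on the URL, which would also alter a file name that happens to contain `300`. As a hedged alternative, the sketch below sets only the `width` query parameter using `urllib.parse`. The function name `with_width` and the example file name are invented for illustration; the `?width=` form is taken from the thumbnail URLs visible in the printed HTML output.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_width(url, width):
    """Return url with its `width` query parameter set to the given value."""
    parts = urlsplit(url)
    # Keep every query parameter except an existing width
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "width"]
    query.append(("width", str(width)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical file name that contains "300" itself
example = "http://commons.wikimedia.org/wiki/Special:FilePath/Plate_300.jpg?width=300"
print(with_width(example, 60))
```

Because only the query string is rebuilt, the path and file name stay untouched.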

The HTML file can also be displayed directly in Colab. Thanks to [KWB](https://open.hpi.de/users/1d38cda1-5e2c-440b-92f1-5ca9f14f56bd).

In [None]:
from IPython.display import HTML
HTML(filename='birthday.html')   # render the file written in the previous cell

Year,Thumbnail,Name and description
1700,,"Nathaniel Bliss, The Reverend Nathaniel Bliss (28 November 1700 – 2 September 1764) was an English astronomer of the 18th century, serving as Britain's fourth Astronomer Royal between 1762 and 1764."
1772,,"Luke Howard, Luke Howard, FRS (28 November 1772 – 21 March 1864) was a British manufacturing chemist and an amateur meteorologist with broad interests in science. His lasting contribution to science is a nomenclature system for clouds, which he proposed in an 1802 presentation to the Askesian Society."
1805,,"John Lloyd Stephens, John Lloyd Stephens was born November 28, 1805, in the township of Shrewsbury, New Jersey. He was the second son of Benjamin Stephens, a successful New Jersey merchant, and Clemence Lloyd, daughter of an eminent local judge. The following year the family moved to New York City. There Stephens received an education in the Classics at two privately tutored schools. At the early age of 13 he enrolled at Columbia College, graduating at the top of his class four years later in 1822."
1831,,"Robert Bartholow, Robert Bartholow or Roberts Bartholow (November 28, 1831 – May 10, 1904) was an American physician from New Windsor, Maryland. He earned his degree in medicine from the University of Maryland in 1852. From 1855 to 1864 he was a surgeon in the U.S. Army. From 1864 to 1879 he was a professor at the Medical College of Ohio in Cincinnati. Afterwards he was a professor at the Jefferson Medical College in Philadelphia."
1834,,"Étienne Laspeyres, Ernst Louis Étienne Laspeyres (28 November 1834 – 4 August 1913) was Professor ordinarius of economics and statistics or State Sciences and cameralistics (public finance and administration) in Basel, Riga, Dorpat (now Tartu), Karlsruhe, and finally for 26 years in Gießen. Laspeyres was the scion of a Huguenot family of originally Gascon descent which had settled in Berlin in the 17th century, and he emphasised the Occitan pronunciation of his name as a link to his Gascon origins."
1837,,"Noah Miller Glatfelter, Dr. Noah Miller Glatfelter was an American physician, genealogist, and amateur botanist and mycologist who lived in St. Louis, Missouri between 1867 and 1911. He served as a surgeon for the Union Army during the American Civil War, and was in private practice as a physician from the 1870s to 1907. In retirement his interests turned to botany and mycology; seven fungi have been named for him."
1848,,"Paul Charles Dubois, He studied medicine at the University of Bern, and in 1876 was a general practitioner of medicine in Bern. He was interested in psychosomatic medicine, eventually gaining a reputation as a highly regarded psychotherapist. In 1902 he became a professor of neuropathology at Bern. Dubois was influenced by the writings of German psychiatrist Johann Christian August Heinroth (1773–1843). Dubois is known for the introduction of ""persuasion therapy"", a process that employed a rational approach for treatment of neurotic disorders. Within this discipline, he developed a psychotherapeutic methodology that was a form of Socratic dialogue, using the doctor-patient relationship as a means to persuade the patient to change his/her behavior. He believed it was necessary to appeal to a patient's intellect"
1851,,"Philip A. Herfort, Philip Adolph Herfort (November 28, 1851 – March 24, 1921) was a German violinist and orchestra leader. He was born in Berlin, Germany to Jewish parents, Adolph (Aron) Herfort (1818–1900) and Clara Herfort (1830–1907) née Maass. Philip Herfort married Antonie Theodore Johanne Lupprian on December 15, 1877 in New York City and fathered four children: Sophie (1879–1966), Paul (1880–1967), Gunther (1888–1986), and Walter (1886–1887). Philip Herfort died on March 24, 1921 in Brooklyn, New York and is buried in Green-Wood Cemetery."
1854,,"Gottlieb Haberlandt, Gottlieb Haberlandt (28 November 1854, Ungarisch-Altenburg (present day Magyaróvár) – 30 January 1945, Berlin) was an Austrian botanist. He was the son of European 'soybean' pioneer Professor Friedrich J. Haberlandt. His son Ludwig Haberlandt was an early reproductive physiologist now given credit as the 'grandfather' of the birth control pill, the pill. The more efficient C-4 photosynthesis in land plants depends on a specialized Kranz (German for wreath) leaf anatomy History of C3 : C4 photosynthesis research first described by Gottlieb Haberlandt in 1904"
1858,,"William Stanley Jr., William Stanley Jr. (November 28, 1858 – May 14, 1916) was an American physicist born in Brooklyn, New York. In his career, he obtained 129 patents covering a variety of electric devices. In 1913, he also patented an all-steel vacuum bottle, and formed the Stanley Bottle Company."
