# Quickstart

KIF is a Wikidata-based framework for integrating knowledge sources.

This quickstart guide presents the basic API of KIF.

----

## Hello world!

We start by importing the `kif_lib` namespace:

In [1]:
from kif_lib import *

We'll also need the Wikidata vocabulary module `wd`:

In [2]:
from kif_lib.vocabulary import wd

Next, we create a KIF SPARQL store pointing to the official Wikidata query service:

In [3]:
kb = Store('sparql', 'https://query.wikidata.org/sparql')

A KIF store is an interface to a knowledge source: it allows us to view the source as a set of Wikidata-like statements.

The `kb` store we just created is an interface to Wikidata itself. We can use it, for example, to fetch three statements about Brazil from Wikidata:

In [4]:
it = kb.filter(subject=wd.Brazil, limit=3)
for stmt in it:
    display(stmt)

(**Statement** (**Item** [Brazil](http://www.wikidata.org/entity/Q155)) (**ValueSnak** (**Property** [shares border with](http://www.wikidata.org/entity/P47)) (**Item** [Argentina](http://www.wikidata.org/entity/Q414))))

(**Statement** (**Item** [Brazil](http://www.wikidata.org/entity/Q155)) (**ValueSnak** (**Property** [shares border with](http://www.wikidata.org/entity/P47)) (**Item** [Paraguay](http://www.wikidata.org/entity/Q733))))

(**Statement** (**Item** [Brazil](http://www.wikidata.org/entity/Q155)) (**ValueSnak** (**Property** [shares border with](http://www.wikidata.org/entity/P47)) (**Item** [wd:Q142](http://www.wikidata.org/entity/Q142))))

## Filters

The `kb.filter(...)` call searches for statements in `kb` matching the restrictions `...`.

The result of a filter call is a (lazy) iterator `it` of statements:

In [5]:
it = kb.filter(subject=wd.Brazil)

We can advance `it` to obtain statements:

In [6]:
next(it)

(**Statement** (**Item** [Brazil](http://www.wikidata.org/entity/Q155)) (**ValueSnak** (**Property** [shares border with](http://www.wikidata.org/entity/P47)) (**Item** [Argentina](http://www.wikidata.org/entity/Q414))))

If no `limit` argument is given to `kb.filter()`, the returned iterator contains *all* matching statements.
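Because an unbounded filter may match a very large number of statements, it can help to consume the iterator in small steps. The sketch below illustrates this with the standard `itertools.islice`. Note that `fake_filter` is a hypothetical stand-in written for this example, not part of KIF; the same pattern should apply to any lazy iterator.

```python
from itertools import islice


def fake_filter(limit=None):
    """Hypothetical stand-in for kb.filter(): lazily yields dummy statements."""
    n = 0
    while limit is None or n < limit:
        yield f"statement-{n}"
        n += 1


it = fake_filter()            # no limit: the iterator is unbounded
first = next(it)              # advance the iterator by one statement
batch = list(islice(it, 3))   # take the next three, leaving the rest unconsumed
print(first, batch)
```

Calling `list(it)` directly on an unbounded iterator would never finish, which is why the sketch caps consumption with `islice`.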

## Basic filters

We can filter statements by any combination of *subject*, *property*, and *value*.

For example:

In [7]:
# match any statement
next(kb.filter())

(**Statement** (**Item** [wd:Q141099](http://www.wikidata.org/entity/Q141099)) (**ValueSnak** (**Property** [part of the series](http://www.wikidata.org/entity/P179)) (**Item** [wd:Q435960](http://www.wikidata.org/entity/Q435960))))

In [8]:
# match statements with subject "Brazil" and property "official website"
next(kb.filter(subject=wd.Brazil, property=wd.official_website))

(**Statement** (**Item** [Brazil](http://www.wikidata.org/entity/Q155)) (**ValueSnak** (**Property** [official website](http://www.wikidata.org/entity/P856)) [https://www.gov.br](https://www.gov.br)))

In [9]:
# match statements with property "official website" and value "https://www.ibm.com/"
next(kb.filter(property=wd.official_website, value=IRI('https://www.ibm.com/')))

(**Statement** (**Item** [IBM](http://www.wikidata.org/entity/Q37156)) (**ValueSnak** (**Property** [official website](http://www.wikidata.org/entity/P856)) [https://www.ibm.com/](https://www.ibm.com/)))

In [10]:
# match statements with value "78.046950192 dalton"
next(kb.filter(value=Quantity('78.046950192', unit=wd.dalton)))

(**Statement** (**Item** [wd:Q83011151](http://www.wikidata.org/entity/Q83011151)) (**ValueSnak** (**Property** [mass](http://www.wikidata.org/entity/P2067)) (**Quantity** 78.046950192 (**Item** [dalton](http://www.wikidata.org/entity/Q483261)))))

We can also match statements having *some* (unknown) value:

In [11]:
next(kb.filter(snak=wd.date_of_birth.some_value()))

(**Statement** (**Item** [wd:Q7262](http://www.wikidata.org/entity/Q7262)) (**SomeValueSnak** (**Property** [date of birth](http://www.wikidata.org/entity/P569))))

Or *no* value:

In [12]:
next(kb.filter(snak=wd.date_of_death.no_value()))

(**Statement** (**Item** [wd:Q124958592](http://www.wikidata.org/entity/Q124958592)) (**NoValueSnak** (**Property** [date of death](http://www.wikidata.org/entity/P570))))
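The matching behaviour used in this section can be pictured with a small in-memory model. This is only a sketch under simplifying assumptions: `Stmt`, `DATA`, and `toy_filter` are hypothetical names invented for illustration, entities are plain strings, and "some value"/"no value" snaks are not modelled. It is not how KIF evaluates filters.

```python
from typing import Iterator, NamedTuple


class Stmt(NamedTuple):
    """Toy statement: a (subject, property, value) triple of plain strings."""
    subject: str
    property: str
    value: str


# A tiny hand-written data set (illustrative only).
DATA = [
    Stmt("Brazil", "continent", "South America"),
    Stmt("Brazil", "official website", "https://www.gov.br"),
    Stmt("IBM", "official website", "https://www.ibm.com/"),
]


def toy_filter(subject=None, property=None, value=None) -> Iterator[Stmt]:
    """Lazily yield statements matching every given restriction (None = any)."""
    for stmt in DATA:
        if subject is not None and stmt.subject != subject:
            continue
        if property is not None and stmt.property != property:
            continue
        if value is not None and stmt.value != value:
            continue
        yield stmt


# No restrictions: every statement matches.
print(len(list(toy_filter())))
# Subject and property restricted, value left open.
print(next(toy_filter(subject="Brazil", property="official website")))
```

Each omitted argument acts as a wildcard, so restrictions combine conjunctively: a statement is returned only if it satisfies all of the arguments that were given.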

## Fingerprints (indirect ids)

So far, we have been using the symbolic aliases defined in the `wd` module to specify entities in filters:

In [13]:
display(wd.Brazil)
display(wd.continent)

(**Item** [Brazil](http://www.wikidata.org/entity/Q155))

(**Property** [continent](http://www.wikidata.org/entity/P30))

Alternatively, we can use their numeric Wikidata ids:

In [14]:
# match statements with subject Q155 (Brazil) and property P30 (continent)
next(kb.filter(subject=wd.Q(155), property=wd.P(30)))

(**Statement** (**Item** [Brazil](http://www.wikidata.org/entity/Q155)) (**ValueSnak** (**Property** [continent](http://www.wikidata.org/entity/P30)) (**Item** [South America](http://www.wikidata.org/entity/Q18))))
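As the outputs above suggest, numeric ids correspond to entity IRIs of the form `http://www.wikidata.org/entity/Q155` (items) and `http://www.wikidata.org/entity/P30` (properties). The helpers below are hypothetical, written only to make that correspondence explicit; they are not KIF functions.

```python
WD_ENTITY = "http://www.wikidata.org/entity/"


def item_iri(n: int) -> str:
    """Build the entity IRI of the item with numeric id n (e.g. 155 -> .../Q155)."""
    return f"{WD_ENTITY}Q{n}"


def property_iri(n: int) -> str:
    """Build the entity IRI of the property with numeric id n (e.g. 30 -> .../P30)."""
    return f"{WD_ENTITY}P{n}"


print(item_iri(155))
print(property_iri(30))
```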

Sometimes, however, ids are not enough.  We might need to specify an entity indirectly by giving not its id but a property it satisfies.

In cases like this, we can use a *fingerprint*:

In [15]:
# match statements whose subject "is a dog" and value "is a human"
next(kb.filter(subject=wd.instance_of(wd.dog), value=wd.instance_of(wd.human)))

(**Statement** (**Item** [wd:Q5270723](http://www.wikidata.org/entity/Q5270723)) (**ValueSnak** (**Property** [owned by](http://www.wikidata.org/entity/P127)) (**Item** [wd:Q935](http://www.wikidata.org/entity/Q935))))

Properties themselves can also be identified using fingerprints:

In [16]:
# match statements whose property is "equivalent to Schema.org's 'weight'"
next(kb.filter(property=wd.equivalent_property('https://schema.org/weight')))

(**Statement** (**Item** [wd:Q38824](http://www.wikidata.org/entity/Q38824)) (**ValueSnak** (**Property** [mass](http://www.wikidata.org/entity/P2067)) (**Quantity** 17 (**Item** [kilogram](http://www.wikidata.org/entity/Q11570)))))

The `-` operator can be used to invert the direction of the property used in the fingerprint:

In [17]:
# match statements whose subject is "the continent of Brazil"
next(kb.filter(subject=-(wd.continent(wd.Brazil))))

(**Statement** (**Item** [South America](http://www.wikidata.org/entity/Q18)) (**NoValueSnak** (**Property** [country](http://www.wikidata.org/entity/P17))))
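One way to read a fingerprint is as the set of entities that satisfy it. The sketch below models this over a handful of hand-written triples. `TRIPLES`, `fingerprint`, and `inverted_fingerprint` are hypothetical names used for illustration only, and the entity "Rex" is made up. The sketch mirrors the reading given in the comments above ("is a dog", "the continent of Brazil"), not KIF's actual implementation.

```python
# Toy knowledge base: (subject, property, value) triples of plain strings.
TRIPLES = {
    ("Brazil", "continent", "South America"),
    ("Argentina", "continent", "South America"),
    ("Rex", "instance of", "dog"),
}


def fingerprint(prop, value):
    """Entities x such that the triple (x, prop, value) holds."""
    return {s for (s, p, v) in TRIPLES if p == prop and v == value}


def inverted_fingerprint(prop, subject):
    """Entities x such that the triple (subject, prop, x) holds."""
    return {v for (s, p, v) in TRIPLES if p == prop and s == subject}


# "has continent South America": entities on the subject side.
print(sorted(fingerprint("continent", "South America")))
# "the continent of Brazil": the inverted direction, entities on the value side.
print(sorted(inverted_fingerprint("continent", "Brazil")))
```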

## And-ing and or-ing fingerprints

Ids and fingerprints can be combined using the operators `&` (and) and `|` (or).

For example:

In [18]:
# match four statements such that:
# - subject is "Brazil" or "Argentina"
# - property is "continent" or "highest point"
it = kb.filter(subject=wd.Brazil | wd.Argentina, property=wd.continent | wd.highest_point, limit=4)
for stmt in it:
    display(stmt)

(**Statement** (**Item** [Brazil](http://www.wikidata.org/entity/Q155)) (**ValueSnak** (**Property** [continent](http://www.wikidata.org/entity/P30)) (**Item** [South America](http://www.wikidata.org/entity/Q18))))

(**Statement** (**Item** [Brazil](http://www.wikidata.org/entity/Q155)) (**ValueSnak** (**Property** [highest point](http://www.wikidata.org/entity/P610)) (**Item** [Pico da Neblina](http://www.wikidata.org/entity/Q739484))))

(**Statement** (**Item** [Argentina](http://www.wikidata.org/entity/Q414)) (**ValueSnak** (**Property** [continent](http://www.wikidata.org/entity/P30)) (**Item** [South America](http://www.wikidata.org/entity/Q18))))

(**Statement** (**Item** [Argentina](http://www.wikidata.org/entity/Q414)) (**ValueSnak** (**Property** [highest point](http://www.wikidata.org/entity/P610)) (**Item** [wd:Q39739](http://www.wikidata.org/entity/Q39739))))

In [19]:
# match four statements such that:
# - subject "has continent South America" and "official language is Portuguese"
# - value "is a river" or "is a mountain"
it = kb.filter(
    subject=wd.continent(wd.South_America) & wd.official_language(wd.Portuguese),
    value=wd.instance_of(wd.river) | wd.instance_of(wd.mountain),
    limit=4)
for stmt in it:
    display(stmt)

(**Statement** (**Item** [Brazil](http://www.wikidata.org/entity/Q155)) (**ValueSnak** (**Property** [located in or next to body of water](http://www.wikidata.org/entity/P206)) (**Item** [wd:Q127892](http://www.wikidata.org/entity/Q127892))))

(**Statement** (**Item** [Brazil](http://www.wikidata.org/entity/Q155)) (**ValueSnak** (**Property** [located in or next to body of water](http://www.wikidata.org/entity/P206)) (**Item** [wd:Q142148](http://www.wikidata.org/entity/Q142148))))

(**Statement** (**Item** [Brazil](http://www.wikidata.org/entity/Q155)) (**ValueSnak** (**Property** [located in or next to body of water](http://www.wikidata.org/entity/P206)) (**Item** [wd:Q3783](http://www.wikidata.org/entity/Q3783))))

(**Statement** (**Item** [Brazil](http://www.wikidata.org/entity/Q155)) (**ValueSnak** (**Property** [highest point](http://www.wikidata.org/entity/P610)) (**Item** [Pico da Neblina](http://www.wikidata.org/entity/Q739484))))

In [20]:
# match four statements such that:
# - subject is "a female" and ("was born in NYC" or "was born in Rio")
# - property is "field of work" or is "equivalent to Schema.org's 'hasOccupation'"
it = kb.filter(
    subject=wd.sex_or_gender(wd.female) & (wd.place_of_birth(wd.New_York_City) | wd.place_of_birth(wd.Rio_de_Janeiro)),
    property=wd.field_of_work | wd.equivalent_property(IRI('https://schema.org/hasOccupation')),
    limit=4)
for stmt in it:
    display(stmt)

(**Statement** (**Item** [wd:Q7212](http://www.wikidata.org/entity/Q7212)) (**ValueSnak** (**Property** [occupation](http://www.wikidata.org/entity/P106)) (**Item** [wd:Q593644](http://www.wikidata.org/entity/Q593644))))

(**Statement** (**Item** [wd:Q7212](http://www.wikidata.org/entity/Q7212)) (**ValueSnak** (**Property** [occupation](http://www.wikidata.org/entity/P106)) (**Item** [wd:Q16533](http://www.wikidata.org/entity/Q16533))))

(**Statement** (**Item** [wd:Q7491](http://www.wikidata.org/entity/Q7491)) (**ValueSnak** (**Property** [occupation](http://www.wikidata.org/entity/P106)) (**Item** [wd:Q593644](http://www.wikidata.org/entity/Q593644))))

(**Statement** (**Item** [wd:Q7491](http://www.wikidata.org/entity/Q7491)) (**ValueSnak** (**Property** [occupation](http://www.wikidata.org/entity/P106)) (**Item** [wd:Q107526461](http://www.wikidata.org/entity/Q107526461))))
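The way `&` and `|` compose restrictions can be sketched with ordinary Python operator overloading. The class `Pred`, the helper `has`, and the `FACTS` table (including the people in it) are hypothetical and made up for this example; the sketch only illustrates the composition pattern and makes no claim about how KIF represents fingerprints internally.

```python
class Pred:
    """Toy composable predicate over entity names."""

    def __init__(self, test):
        self.test = test

    def __and__(self, other):
        # Both predicates must hold.
        return Pred(lambda x: self.test(x) and other.test(x))

    def __or__(self, other):
        # At least one predicate must hold.
        return Pred(lambda x: self.test(x) or other.test(x))

    def __call__(self, x):
        return self.test(x)


# Made-up facts about made-up people.
FACTS = {
    "Alice": {"sex": "female", "born": "New York City"},
    "Bruna": {"sex": "female", "born": "Rio de Janeiro"},
    "Carlos": {"sex": "male", "born": "Rio de Janeiro"},
    "Dana": {"sex": "female", "born": "Paris"},
}


def has(key, value):
    """Predicate: the entity's fact `key` equals `value`."""
    return Pred(lambda x: FACTS[x].get(key) == value)


query = has("sex", "female") & (
    has("born", "New York City") | has("born", "Rio de Janeiro"))
matches = sorted(name for name in FACTS if query(name))
print(matches)
```

In Python, `&` binds more tightly than `|`, so `a & b | c` is parsed as `(a & b) | c`. The parentheses around the or-ed part of the subject restriction are therefore needed to get the intended grouping.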

## Count and contains

A variant of filter is `kb.count()`, which counts the number of statements matching the given restrictions:

In [22]:
kb.count(subject=wd.Brazil, property=wd.population | wd.official_language)

2

Another variant is `kb.contains()`, which tests whether a given statement is in `kb`:

In [23]:
stmt1 = wd.official_language(wd.Brazil, wd.Portuguese)
kb.contains(stmt1)

True

In [24]:
stmt2 = wd.official_language(wd.Brazil, wd.Spanish)
kb.contains(stmt2)

False
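Conceptually, both calls can be understood in terms of filtering: counting is the number of matches, and containment asks whether a fully specified statement has at least one match. The sketch below shows this relationship on a toy in-memory list. `STATEMENTS`, `toy_filter`, `toy_count`, and `toy_contains` are hypothetical names, and a real store may well compute these results differently.

```python
# Toy data set of (subject, property, value) triples (illustrative only).
STATEMENTS = [
    ("Brazil", "official language", "Portuguese"),
    ("Brazil", "continent", "South America"),
    ("Argentina", "official language", "Spanish"),
]


def toy_filter(subject=None, property=None, value=None):
    """Lazily yield triples matching every given restriction (None = any)."""
    for s, p, v in STATEMENTS:
        if subject in (None, s) and property in (None, p) and value in (None, v):
            yield (s, p, v)


def toy_count(**restrictions):
    """Count matches without building an intermediate list."""
    return sum(1 for _ in toy_filter(**restrictions))


def toy_contains(stmt):
    """Test whether the exact triple occurs in the data set."""
    s, p, v = stmt
    return any(True for _ in toy_filter(subject=s, property=p, value=v))


print(toy_count(subject="Brazil"))
print(toy_contains(("Brazil", "official language", "Portuguese")))
print(toy_contains(("Brazil", "official language", "Spanish")))
```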

## Final remarks

This concludes the quickstart guide.

There are many other calls in KIF's Store API. For more information, see the [API docs](https://ibm.github.io/kif/) and the [examples](https://github.com/IBM/kif/tree/main/examples) directory.