### Retriever
* A retriever is a component in LangChain that fetches relevant documents from a data source in response to a user’s query.
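* LangChain provides several retriever types, backed by different sources such as vector stores or external APIs like Wikipedia.
* All retrievers in LangChain are runnables, so they expose the standard `invoke` method and can be composed into chains.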

### Wikipedia Retriever
* Retrieves pages from wikipedia.org and converts them into the Document format used downstream.

WikipediaRetriever parameters include:

* (optional) lang: default="en". Selects which language edition of Wikipedia to search.
* (optional) load_max_docs: default=100. Limits the number of downloaded documents. Downloading all 100 documents takes time, so use a small number for experiments. There is currently a hard limit of 300.
* (optional) load_all_available_meta: default=False. By default, only the most important fields are downloaded: Published (the date the document was published or last updated), title, and Summary. If True, the other fields are downloaded as well.

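get_relevant_documents() takes one argument, query: the free text used to find documents in Wikipedia. In recent LangChain versions this method is deprecated in favor of invoke(query), which is what the cells below use.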

In [2]:
from langchain_community.retrievers import WikipediaRetriever

##### Initialize the retriever (optional: set language and top_k)

In [7]:
retriever = WikipediaRetriever(top_k_results=2, lang="en", load_max_docs=2)

##### Define your query

In [15]:
query = "the geopolitical history of India"

##### Get relevant Wikipedia documents

In [16]:
docs = retriever.invoke(query)

In [17]:
print(docs[0].page_content)

Anatomically modern humans first arrived on the Indian subcontinent between 73,000 and 55,000 years ago. The earliest known human remains in South Asia date to 30,000 years ago. Sedentariness began in South Asia around 7000 BCE; by 4500 BCE, settled life had spread, and gradually evolved into the Indus Valley Civilisation, one of three early cradles of civilisation in the Old World, which flourished between 2500 BCE and 1900 BCE in present-day Pakistan and north-western India. Early in the second millennium BCE, persistent drought caused the population of the Indus Valley to scatter from large urban centres to villages. Indo-Aryan tribes moved into the Punjab from Central Asia in several waves of migration. The Vedic Period of the Vedic people in northern India (1500–500 BCE) was marked by the composition of their extensive collections of hymns (Vedas). The social structure was loosely stratified via the varna system, incorporated into the highly evolved present-day Jāti system. The pa

In [18]:
docs

[Document(metadata={'title': 'History of India', 'summary': "Anatomically modern humans first arrived on the Indian subcontinent between 73,000 and 55,000 years ago. The earliest known human remains in South Asia date to 30,000 years ago. Sedentariness began in South Asia around 7000 BCE; by 4500 BCE, settled life had spread, and gradually evolved into the Indus Valley Civilisation, one of three early cradles of civilisation in the Old World, which flourished between 2500 BCE and 1900 BCE in present-day Pakistan and north-western India. Early in the second millennium BCE, persistent drought caused the population of the Indus Valley to scatter from large urban centres to villages. Indo-Aryan tribes moved into the Punjab from Central Asia in several waves of migration. The Vedic Period of the Vedic people in northern India (1500–500 BCE) was marked by the composition of their extensive collections of hymns (Vedas). The social structure was loosely stratified via the varna system, incorpo

##### Print retrieved content

In [19]:
for i, doc in enumerate(docs):
    print(f"\n--- Result {i+1} ---")
    # page_content is printed in full; slice it (e.g. doc.page_content[:500]) to truncate for display
    print(f"Content:\n{doc.page_content}...")


--- Result 1 ---
Content:
Anatomically modern humans first arrived on the Indian subcontinent between 73,000 and 55,000 years ago. The earliest known human remains in South Asia date to 30,000 years ago. Sedentariness began in South Asia around 7000 BCE; by 4500 BCE, settled life had spread, and gradually evolved into the Indus Valley Civilisation, one of three early cradles of civilisation in the Old World, which flourished between 2500 BCE and 1900 BCE in present-day Pakistan and north-western India. Early in the second millennium BCE, persistent drought caused the population of the Indus Valley to scatter from large urban centres to villages. Indo-Aryan tribes moved into the Punjab from Central Asia in several waves of migration. The Vedic Period of the Vedic people in northern India (1500–500 BCE) was marked by the composition of their extensive collections of hymns (Vedas). The social structure was loosely stratified via the varna system, incorporated into the highly evolved pres
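
The loop above prints each `page_content` in full, which gets long for Wikipedia pages. Below is a minimal sketch of a preview helper that shows the title and a shortened excerpt. `preview`, `max_chars`, and `sample_docs` are hypothetical names for illustration; the stand-in objects only mimic the `metadata` and `page_content` attributes of a LangChain Document so the snippet runs without network access.

```python
from types import SimpleNamespace


def preview(doc, max_chars=200):
    """Return the document title plus its content cut to at most max_chars characters."""
    title = doc.metadata.get("title", "(untitled)")
    text = doc.page_content
    if len(text) > max_chars:
        # Only add the ellipsis when something was actually cut off
        text = text[:max_chars].rstrip() + "..."
    return f"{title}\n{text}"


# Stand-ins with the same two attributes as a LangChain Document (assumption for the demo)
sample_docs = [
    SimpleNamespace(
        metadata={"title": "History of India"},
        page_content="Anatomically modern humans first arrived on the Indian subcontinent " * 5,
    ),
    SimpleNamespace(metadata={}, page_content="Short text."),
]

for i, doc in enumerate(sample_docs, start=1):
    print(f"\n--- Result {i} ---")
    print(preview(doc, max_chars=80))
```

With real results, the same helper can be called as `preview(doc)` on each item returned by `retriever.invoke(query)`.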