In [2]:
# imports
from dotenv import load_dotenv
from langchain_huggingface.embeddings import HuggingFaceEmbeddings
from langchain_community.document_loaders import PyMuPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma

In [6]:
# load environment variables (e.g. access keys) and define the inputs
load_dotenv()
file = '../document_loaders/books/scientific_legacy_apollo.pdf'
user_query = "What do we know about lunar cratering rate?"

In [4]:
# instantiate embedding model
embed_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

## Loading a PDF document

In [7]:
# load document
loader = PyMuPDFLoader(file)
pages = loader.load()
print(len(pages))
# print(pages[2])

11


In [10]:
print(pages[1].page_content)

12.5 days, with just 3.4 days spent performing extravehicular activities (EVAs) 
outside the lunar modules (Orloff and Harland 2006). Yet during this all-too-brief a 
time samples were collected, measurements made, and instruments deployed which 
have revolutionised lunar and planetary science and which continue to have a major 
scientific impact today.

Fig 2. The Apollo landing sites. Note their restriction to the central part of the nearside – there is a lot more of the Moon to explore! (USGS/Dr K.H. Joy).

Exploration Efficiency 
 
In their cumulative 12.5 days (25 man-days) on the lunar surface, the twelve Apollo 
moonwalkers traversed a total distance of 95.5 km from their landing sites (heavily 
weighted to the last three missions that were equipped with the Lunar Roving 
Vehicle), collected and returned to Earth 382 kg of rock and soil samples (from over 
2000 discrete sample localities), drilled three geological sample cores to depths 
g

## Text splitting

In [11]:
# split into chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
split_docs = splitter.split_documents(pages)
print(len(split_docs))
# print(split_docs[-2])

33


In [14]:
print(split_docs[0].page_content)

The Scientific Legacy of Apollo 
 
Ian A. Crawford, Department of Earth and Planetary Sciences, Birkbeck College, 
University of London (i.crawford@bbk.ac.uk).  
 
Article published in the December 2012 issue of the Royal Astronomical Society’s 
journal Astronomy and Geophysics (Vol. 53, pp. 6.24-6.28). 
 
 
Abstract 
On the 40th anniversary of the last human expedition to the Moon, I review the 
scientific legacy of the Apollo programme and argue that science would benefit from 
a human return to the Moon. 
 
Introduction 
 
This December marks 40 years since the last human beings to set foot on the Moon, 
Gene Cernan and Harrison “Jack” Schmitt of Apollo 17, left the lunar surface and 
returned safely to Earth. This anniversary alone would have justified a retrospective 
look at the legacy of the Apollo project, but it has been given additional poignancy by 
the death earlier this year of Neil Armstrong, the first man to set foot on the lunar


In [15]:
print(split_docs[1].page_content)

the death earlier this year of Neil Armstrong, the first man to set foot on the lunar 
surface with Apollo 11 in July 1969. The history of the Apollo project, and its 
geopolitical motivation within the context of the Cold War, is of course well 
documented (e.g. Chaiken 1994; Burrows 1998; Orloff and Harland 2006) and need 
not be repeated here. However, although the scientific legacy of Apollo has also been 
well-documented (e.g. Heiken et al. 1991; Wilhelms 1993; Beattie 2001), and is 
generally well-known within the lunar science community, I have found that it is still 
underappreciated by many astronomers, and even some planetary scientists who are 
not directly involved in lunar studies. That, at any rate, is my justification for taking 
this opportunity to give a brief review of Apollo science. 
 
 
Fig. 1. One of the last two men 
on the Moon: Harrison Schmitt 
stands next to a large boulder at 
the Apollo 17 Station 6 locality 
in December 1972. Note the
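
The two chunks printed above share a line ("the death earlier this year of Neil Armstrong, ..."): the first chunk ends with it and the second begins with it. That repetition comes from `chunk_overlap`. The sketch below illustrates the idea with a plain fixed-size sliding window. It is a simplification and not how `RecursiveCharacterTextSplitter` works internally (the real splitter prefers to break on separators such as paragraphs and newlines); the function name `sliding_window_chunks` and the sample string are hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows where neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# Hypothetical toy input, small enough to inspect by eye
sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each chunk starts with the last three characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.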


## Creating the vector store

In [16]:
# create the vector store and persist it to the local 'chroma' directory
vector_store = Chroma.from_documents(
    documents=split_docs,
    embedding=embed_model,
    collection_name="apollo",
    persist_directory="chroma",
)

In [23]:
vector_store.get().keys()

dict_keys(['ids', 'embeddings', 'documents', 'uris', 'included', 'data', 'metadatas'])

In [31]:
# search for relevant documents with a plain semantic (similarity) search
similarity_result = vector_store.similarity_search(user_query, k=3)
print(len(similarity_result))


3


In [33]:
print(similarity_result[0].page_content)

Probably the most important result based on the Apollo material has been the 
calibration of the lunar cratering rate, especially over the period 3.2 to 3.8 billion 
years ago covered by the Apollo samples (reviewed by Stöffler et al. 2006). Only by 
comparing the density of impact craters on surfaces whose ages have been obtained 
independently by laboratory radiometric analyses of returned samples is it possible to 
obtain a calibration of the cratering rate. Analysis of the Apollo samples 
(supplemented by those obtained by the Soviet Union’s Luna robotic missions) has 
enabled this to be done for the Moon (Fig. 4), which remains the only planetary body 
for which such a calibration exists. Not only has this facilitated the dating of lunar 
surfaces from which samples have yet to be obtained, but it is used, with assumptions, 
to estimate the ages of cratered surfaces throughout the Solar System from Mercury to


In [34]:
print(similarity_result[1].page_content)

to estimate the ages of cratered surfaces throughout the Solar System from Mercury to 
the moons of the outer planets. In particular, until such time as samples are returned 
from Mars (an important, but apparently ever-receding, scientific goal of future 
exploration), extrapolations of the Apollo calibration of the lunar cratering rate 
remains the only way of dating key events in the history of that planet, including those 
related to past habitability (Kallenbach et al. 2001). Arguably, this alone would justify 
the Apollo missions from a scientific point of view.

Fig 4. The lunar crater density (number of craters larger than 1 km in diameter per square km) 
as a function of surface age as calibrated by Apollo (A) and Luna (L) samples (modified from 
Stöffler et al., 2006; reproduced with permission of the Mineralogical Society of America).
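
Conceptually, a similarity search embeds the query with the same model as the chunks and returns the `k` stored chunks whose vectors lie closest to it. The sketch below shows that ranking step on hand-made 3-dimensional vectors. Everything in it is an assumption for illustration: the vectors and labels are invented, and cosine similarity is used only as an example, since the metric Chroma actually applies depends on how the collection is configured.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int) -> list[str]:
    """Return the ids of the k vectors most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy "embeddings"; real ones from all-MiniLM-L6-v2 have 384 dimensions
toy_vectors = {
    "cratering": [0.9, 0.1, 0.0],
    "seismology": [0.1, 0.9, 0.0],
    "samples": [0.7, 0.2, 0.1],
    "geopolitics": [0.0, 0.1, 0.9],
}
toy_query = [1.0, 0.0, 0.0]

ranked = top_k(toy_query, toy_vectors, k=3)
print(ranked)
```

With `k=3` the least similar vector is dropped, in the same way that the search above returns only three of the 33 chunks.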


## Retriever

In [None]:
# wrap the vector store in a retriever that returns the top 3 chunks
retriever = vector_store.as_retriever(search_kwargs={"k": 3})
retriever_result = retriever.invoke(user_query)
print(len(retriever_result))
for doc in retriever_result:
    print(doc.page_content, end="\n\n")
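
One way to think about `as_retriever` is as a thin wrapper that fixes the search settings once and exposes a single `invoke(query)` method, so downstream code does not need to know which store or parameters sit behind it. The sketch below mimics that pattern in plain Python. `ToyRetriever`, `keyword_search`, and the small corpus are hypothetical stand-ins, and the word-overlap ranking merely replaces the embedding search to keep the example self-contained; it is not LangChain's implementation.

```python
from typing import Callable

# Hypothetical mini corpus standing in for the stored chunks
TOY_CORPUS = [
    "calibration of the lunar cratering rate",
    "seismometers measured moonquakes",
    "cratering rate dates surfaces on Mars",
    "astronauts drove the lunar roving vehicle",
]


def keyword_search(query: str, k: int) -> list[str]:
    """Rank corpus entries by the number of words they share with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        TOY_CORPUS,
        key=lambda text: len(query_words & set(text.lower().split())),
        reverse=True,
    )
    return ranked[:k]


class ToyRetriever:
    """Binds search settings once and exposes a uniform invoke(query) method."""

    def __init__(self, search_fn: Callable[..., list[str]], search_kwargs: dict):
        self.search_fn = search_fn
        self.search_kwargs = search_kwargs

    def invoke(self, query: str) -> list[str]:
        # Forward the query together with the stored keyword arguments
        return self.search_fn(query, **self.search_kwargs)


toy_retriever = ToyRetriever(keyword_search, search_kwargs={"k": 2})
results = toy_retriever.invoke("lunar cratering rate")
print(results)
```

Because the settings live in the retriever, changing `k` or swapping the search function does not affect the code that calls `invoke`.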