- Install Python 3.9.0 (preferably)
- Install the requirements (from inside the project folder)
pip install -r requirements.text
- Run the app
python app.py
HOST = http://127.0.0.1:8000 (localhost)
endpoint = /get_mcq_questions
- mode = One of ['all', 'single', 'interval']
  a. all: extract the entire text of the PDF
  b. single: extract text from a single page
  c. interval: extract ±n pages around a target page
- url = A public URL string for a PDF file hosted on Google Drive
- num_questions = Number of questions to generate
- page_number = Used only by the single and interval modes; specifies the target page
- interval = Used only by the interval mode; specifies the interval n around the target page
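The three modes can be read as a page-selection rule. The sketch below only illustrates that rule and is not the server's code: the function name `select_pages`, the 1-based page numbering, and the clamping at the document boundaries are all assumptions.

```python
def select_pages(mode, total_pages, page_number=None, interval=None):
    """Return the 1-based page numbers a given mode would cover (illustrative)."""
    if mode == "all":
        return list(range(1, total_pages + 1))
    if mode == "single":
        if page_number is None:
            raise ValueError("page_number is required for mode 'single'")
        return [page_number]
    if mode == "interval":
        if page_number is None or interval is None:
            raise ValueError("page_number and interval are required for mode 'interval'")
        # Assumption: pages outside the document are dropped rather than raising.
        first = max(1, page_number - interval)
        last = min(total_pages, page_number + interval)
        return list(range(first, last + 1))
    raise ValueError(f"unknown mode: {mode!r}")


# page_number 2 with interval 2 on a hypothetical 10-page PDF
print(select_pages("interval", total_pages=10, page_number=2, interval=2))  # [1, 2, 3, 4]
```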
{
"url": "https://drive.google.com/file/d/1TGEgTeDQAS2NyS36_KXv1ZyIcA0tFTvr/view?usp=drive_link",
"num_questions": 5,
"mode": "interval",
"page_number": 2,
"interval":2
}
{
"status": true,
"reason": "",
"questions": [
{
"question": "What is the first step in urine formation?",
"options": [
"Reabsorption",
"Secretion",
"Glomerular filtration",
"Ultra filtration"
],
"answer": "Glomerular filtration"
},
{
"question": "What is the average amount of blood filtered by the kidneys per minute?",
"options": [
"500-600 ml",
"700-800 ml",
"900-1000 ml",
"1100-1200 ml"
],
"answer": "1100-1200 ml"
},
{
"question": "Which type of nephron has a loop of Henle that runs deep into the medulla?",
"options": [
"Cortical nephrons",
"Juxta medullary nephrons",
"Glomerular nephrons",
"Peritubular nephrons"
],
"answer": "Juxta medullary nephrons"
},
{
"question": "What are the major forms of nitrogenous wastes excreted by animals?",
"options": [
"Ammonia, urea, and uric acid",
"Carbon dioxide, water, and ions",
"Proteins, fats, and carbohydrates",
"Sodium, potassium, and chloride"
],
"answer": "Ammonia, urea, and uric acid"
},
{
"question": "What is the excretory structure in Platyhelminthes, rotifers, some annelids and the cephalochordate – Amphioxus?",
"options": [
"Nephridia",
"Malpighian tubules",
"Protonephridia or flame cells",
"Antennal glands"
],
"answer": "Protonephridia or flame cells"
}
]
}
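A minimal client sketch for this endpoint, using only the standard library. The HTTP method is not stated above, so POST with a JSON body is an assumption; the helper names `build_mcq_payload`, `check_questions`, and `get_mcq_questions` are hypothetical as well.

```python
import json
import urllib.request

HOST = "http://127.0.0.1:8000"


def build_mcq_payload(url, num_questions, mode="all", page_number=None, interval=None):
    """Assemble the request body from the parameters documented above."""
    payload = {"url": url, "num_questions": num_questions, "mode": mode}
    if mode in ("single", "interval"):
        payload["page_number"] = page_number
    if mode == "interval":
        payload["interval"] = interval
    return payload


def check_questions(response):
    """Return the question list if every answer appears among its options."""
    if not response.get("status"):
        raise RuntimeError(response.get("reason") or "request failed")
    for item in response["questions"]:
        if item["answer"] not in item["options"]:
            raise ValueError(f"answer not among options: {item['question']}")
    return response["questions"]


def get_mcq_questions(payload, host=HOST):
    """Send the payload to the server. Assumption: POST with a JSON body."""
    request = urllib.request.Request(
        host + "/get_mcq_questions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

With the app running, `check_questions(get_mcq_questions(build_mcq_payload(pdf_url, 5, "interval", 2, 2)))` would return the list under "questions", where `pdf_url` is the public PDF link.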
endpoint = /chat
- mode = One of ['all', 'single', 'interval']
  a. all: extract the entire text of the PDF
  b. single: extract text from a single page
  c. interval: extract ±n pages around a target page
- url = A public URL string for a PDF file hosted on Google Drive
- question = The question asked by the user
- page_number = Used only by the single and interval modes; specifies the target page
- interval = Used only by the interval mode; specifies the interval n around the target page
- chat_history = A list of (question, answer) pairs
- page_number_location = A list of possible page-number locations, in decreasing order of priority. Supported locations:
  lc - lower center
  uc - upper center
  lr - lower right
  ll - lower left
  ur - upper right
  ul - upper left
- page_number_style = How the page numbers are written: only numeric, or alphanumeric as in John Deere. One of ["alpha_numeric", "only_numeric"]
{
"url": "https://ncert.nic.in/textbook/pdf/kebo116.pdf",
"question": "what is this page about",
"mode": "interval",
"page_number": 2,
"interval": 2,
"chat_history": [["question", "answer"], ["question", "answer"]]
}
{
"error": false,
"message": "Success",
"reason": "",
"response_data": {
"answer": "This page is primarily about the functions and structures of the kidney, with a focus on the processes of filtration, reabsorption, and secretion. It details how the glomerulus filters blood and forms filtrate in the Bowman’s capsule, and how reabsorption of the filtrate takes place in different parts of the nephrons. It also discusses the role of the juxta glomerular apparatus (JGA) in regulating the Glomerular Filtration Rate (GFR). The page further explains the process of dialysis and kidney transplantation as methods for treating kidney failure. It also defines certain kidney conditions such as renal calculi and glomerulonephritis. The page includes a detailed explanation of the function of the tubules, including the Proximal Convoluted Tubule (PCT), Henle’s Loop, Distal Convoluted Tubule (DCT), and Collecting Duct. It also discusses the mechanism of concentration of the filtrate, particularly the role of Henle’s loop and vasa recta in producing a concentrated urine. Lastly, the page includes exercises and questions for further understanding of the topic.",
"guid": 0
},
"status": 200
}
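Since chat_history carries earlier question and answer pairs, a client has to keep it up to date between calls. The sketch below shows one way to do that; the helper names are hypothetical, and it assumes the answer is read from "response_data" as in the sample response and that pairs travel as two-element JSON arrays.

```python
import json


def record_turn(chat_history, question, response):
    """Return a new history with the latest question and its answer appended."""
    answer = response["response_data"]["answer"]
    return chat_history + [(question, answer)]


def history_to_json(chat_history):
    """Serialize the history; JSON has no tuple type, so pairs become arrays."""
    return json.dumps([list(pair) for pair in chat_history])


history = []
# A shortened stand-in for a real /chat response.
reply = {"error": False, "response_data": {"answer": "It covers the kidney.", "guid": 0}}
history = record_turn(history, "what is this page about", reply)
print(history_to_json(history))
```

The updated `history` is then sent as chat_history with the next question.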
endpoint = /crawl
- url = A public URL string for the web page to start crawling from
- depth = A number indicating how deep into the site to follow links. For example, if a page contains 10 links and depth is 3, the crawler also visits those 10 links and the links found within them.
{
"url": "https://investor.fb.com/financials/",
"depth": 1
}
{
"data": [
"https://www.abc.xyz/assets/d4/4f/a48b94d548d0b2fdc029a95e8c63/2022-alphabet-annual-report.pdf",
"https://ai.google/static/documents/ai-principles-2023-progress-update.pdf",
"https://www.abc.xyz/assets/9a/bd/838c917c4b4ab21f94e84c3c2c65/goog-10-k-q4-2022.pdf",
"https://www.abc.xyz/assets/c4/d3/fb142c0f4a78a278d96ad5597ad9/2022q4-alphabet-earnings-release.pdf",
"https://www.abc.xyz/assets/b1/d0/c66d744443e698fd63a3ae81e12a/2022q3-alphabet-earnings-release.pdf",
"https://www.abc.xyz/assets/a7/5b/9e5ae0364b12b4c883f3cf748226/goog-exhibit-99-1-q1-2023-19.pdf",
"https://www.abc.xyz/assets/31/25/fb7b6946475d96b7fa4b9c3e2149/2022q1-alphabet-earnings-release.pdf",
"https://www.abc.xyz/assets/20/ef/844a05b84b6f9dbf2c3592e7d9c7/2023q2-alphabet-earnings-release.pdf",
"https://www.abc.xyz/assets/4a/f6/411d938e492e9b66749e2ba1984f/goog-10-q-q2-2023-4.pdf",
"https://www.abc.xyz/assets/c2/3e/0d6d568e4f56a1d14ca6b70c3443/goog-10-q-q3-2023.pdf",
"https://www.abc.xyz/assets/0d/4a/646d28c945aba76a5eeeba68e686/2022q2-alphabet-earnings-release.pdf",
"https://www.abc.xyz/assets/86/99/68122c444c4a93d2228e21ecc16b/20230426-alphabet-10q.pdf",
"https://www.abc.xyz/assets/fa/0e/606be5234d8c895e5e167d38811f/20220427-alphabet-10q.pdf",
"https://www.abc.xyz/assets/4a/3e/3e08902c4a45b5cf530e267cf818/2023q3-alphabet-earnings-release.pdf",
"https://www.abc.xyz/assets/f2/48/c0bc469747b691dd301e91cb10cc/20220726-alphabet-10q.pdf",
"https://www.abc.xyz/assets/06/a6/2ea9850a4b4584c07fac2c1b517d/20221025-alphabet-10q.pdf"
],
"error": true,
"message": "Sucess",
"reason": ""
}
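To make the depth parameter concrete, here is a rough sketch of a depth-limited, breadth-first crawl that collects PDF links. It is not the server's implementation: `links_by_page` is a hypothetical in-memory stand-in for fetching and parsing pages, and counting the start page as level 1 is an assumption chosen to match the depth 3 example above.

```python
from collections import deque


def crawl_pdf_links(start_url, depth, links_by_page):
    """Breadth-first walk over `links_by_page`, collecting links ending in .pdf.

    `links_by_page` maps a page URL to the links found on that page.
    """
    seen = {start_url}
    queue = deque([(start_url, 1)])  # Assumption: the start page is level 1.
    pdfs = []
    while queue:
        page, level = queue.popleft()
        for link in links_by_page.get(page, []):
            if link in seen:
                continue
            seen.add(link)
            if link.lower().endswith(".pdf"):
                pdfs.append(link)
            elif level < depth:
                # Only follow a non-PDF link while the depth budget allows it.
                queue.append((link, level + 1))
    return pdfs


site = {
    "https://example.com/": ["https://example.com/a.pdf", "https://example.com/reports"],
    "https://example.com/reports": ["https://example.com/b.pdf", "https://example.com/"],
}
print(crawl_pdf_links("https://example.com/", 1, site))  # ['https://example.com/a.pdf']
```

Raising depth to 2 on the same map also visits the reports page and picks up b.pdf; the `seen` set keeps the link back to the start page from causing a loop.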
endpoint = /create_metadata
This API is to be called from the frontend whenever a new PDF is added; it creates the metadata for that PDF.
- url = A public URL string for a PDF file hosted on Google Drive
- page_number_location = A list of possible page-number locations, in decreasing order of priority. Supported locations:
  lc - lower center
  uc - upper center
  lr - lower right
  ll - lower left
  ur - upper right
  ul - upper left
- page_number_style = How the page numbers are written: only numeric, or alphanumeric as in John Deere. One of ["alpha_numeric", "only_numeric"]
{
"url": "https://assets.openstax.org/oscms-prodcms/media/documents/ConceptsofBiology-WEB.pdf",
"page_number_location": ["lc", "lr", "ll"],
"page_number_style": "alpha_numeric"
}
{
"error": false,
"file_path": "pdfs/ConceptsofBiology-WEB.pdf",
"json_path": "pdfs\\ConceptsofBiology-WEB.json",
"message": "Success",
"reason": "",
"status": 200
}
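Checking the location codes and style on the client side before sending can catch typos early. This is only a sketch: the helper name `build_metadata_payload` is hypothetical, and whether the server performs the same validation is not documented here.

```python
SUPPORTED_LOCATIONS = {
    "lc": "lower center",
    "uc": "upper center",
    "lr": "lower right",
    "ll": "lower left",
    "ur": "upper right",
    "ul": "upper left",
}
SUPPORTED_STYLES = ("alpha_numeric", "only_numeric")


def build_metadata_payload(url, page_number_location, page_number_style):
    """Validate the documented options and return the request body."""
    unknown = [code for code in page_number_location if code not in SUPPORTED_LOCATIONS]
    if unknown:
        raise ValueError(f"unsupported page_number_location codes: {unknown}")
    if page_number_style not in SUPPORTED_STYLES:
        raise ValueError(f"unsupported page_number_style: {page_number_style!r}")
    return {
        "url": url,
        # Order matters: earlier entries have higher priority.
        "page_number_location": list(page_number_location),
        "page_number_style": page_number_style,
    }
```

For the sample request above, `build_metadata_payload(url, ["lc", "lr", "ll"], "alpha_numeric")` returns the same three fields, while a code such as "cc" raises a ValueError.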
To run the test cases, first install the pytest library using the following command.
pip install pytest==8.0.0
Execute the tests by running
pytest tests
Please note that 2 test cases for create_metadata will fail if the metadata is already present. If you want all test cases to pass, delete all files in the /pdfs folder.
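Clearing the folder before a test run can be scripted. The helper below is a small sketch with a hypothetical name (`clear_folder`); it assumes the folder is passed explicitly and leaves subfolders untouched. It deletes files permanently, so check the path before calling it.

```python
from pathlib import Path


def clear_folder(folder):
    """Delete every regular file directly inside `folder`; return the names removed."""
    removed = []
    for entry in sorted(Path(folder).iterdir()):
        if entry.is_file():
            entry.unlink()
            removed.append(entry.name)
    return removed
```

Calling `clear_folder("pdfs")` from the project folder before `pytest tests` would remove previously created PDFs and metadata files, assuming the folder sits there as the file_path in the sample response suggests.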