dchandak99/Parsers

Parsers

XML (from PDF) to TXT breakdown

Soup_final.ipynb takes a directory of XML files (PDFs parsed with GROBID) and creates one folder per input file. Each folder contains a set of files, one for each section of the original paper.
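
The following is a minimal sketch of that kind of breakdown, not the notebook's actual code. It assumes GROBID's TEI output, where each section is a `<div>` holding a `<head>` (the section title) and `<p>` paragraphs. The function names `split_sections` and `write_sections`, the `NN_Title.txt` file naming scheme, and the choice of `html.parser` are all illustrative assumptions.

```python
import re
from pathlib import Path

from bs4 import BeautifulSoup


def split_sections(xml_text, parser="html.parser"):
    """Return (heading, body_text) pairs, one per <div> that has a direct <head>.

    "html.parser" is used so the sketch needs only Beautiful Soup itself;
    passing parser="xml" (which requires lxml) is the stricter option.
    """
    soup = BeautifulSoup(xml_text, parser)
    sections = []
    for div in soup.find_all("div"):
        head = div.find("head", recursive=False)
        if head is None:
            continue  # skip divs without a section title
        # Assumes flat sections; paragraphs of nested divs would be included too.
        paragraphs = [p.get_text(" ", strip=True) for p in div.find_all("p")]
        sections.append((head.get_text(" ", strip=True), "\n".join(paragraphs)))
    return sections


def write_sections(xml_dir, out_dir):
    """Create one folder per XML file, with one .txt file per section."""
    for xml_path in sorted(Path(xml_dir).glob("*.xml")):
        paper_dir = Path(out_dir) / xml_path.stem
        paper_dir.mkdir(parents=True, exist_ok=True)
        sections = split_sections(xml_path.read_text(encoding="utf-8"))
        for i, (heading, body) in enumerate(sections, start=1):
            # Hypothetical naming scheme: index plus a filesystem-safe title.
            safe = re.sub(r"[^A-Za-z0-9]+", "_", heading).strip("_") or "section"
            (paper_dir / f"{i:02d}_{safe}.txt").write_text(body, encoding="utf-8")
```

Calling `write_sections("xml_in", "txt_out")` on a directory containing `paper.xml` would produce `txt_out/paper/` with one text file per titled section; both directory names here are placeholders.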

Tools used: Beautiful Soup

Output Data: used in https://github.com/vmm221313/LongSumm
