Data and code for LREC 2022 paper "STAPI: An Automatic Scraper for Extracting Iterative Title-Text Structure from Web Documents"

ZN1010/STAPI

STAPI

STAPI (Section Title And Prose text Identifier) is a two-step system for labeling section titles and prose text in HTML documents. Our paper was accepted for poster presentation at LREC 2022.

The source code of our training pipeline is in the Software directory. We also created a web demo here.
