michael153/autociter

Autociter

Authors: Michael Wan, Balaji Veeramani

Overview

Uses natural language processing (NLP) to extract citation information from web pages.

Dependencies

  • dateparser
  • html2text
  • keras
  • numpy
  • PyPDF2
  • scikit-learn
  • termcolor
  • timeout-decorator
  • tensorflow
  • fake-useragent

Install with pip: pip install dateparser html2text keras PyPDF2 termcolor
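Before running the project, it can help to check which of the listed dependencies are importable in the current environment. The following is a minimal sketch using only the standard library and is not part of the repository. The function name `missing_dependencies` and the `IMPORT_NAMES` table are hypothetical. The mapping from pip package names to import names is an assumption based on how these packages are commonly published (for example, scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Assumed mapping: pip package name -> module name used in `import`.
# This table is illustrative and not taken from the repository.
IMPORT_NAMES = {
    "dateparser": "dateparser",
    "html2text": "html2text",
    "keras": "keras",
    "numpy": "numpy",
    "PyPDF2": "PyPDF2",
    "scikit-learn": "sklearn",
    "termcolor": "termcolor",
    "timeout-decorator": "timeout_decorator",
    "tensorflow": "tensorflow",
    "fake-useragent": "fake_useragent",
}


def missing_dependencies(import_names=IMPORT_NAMES):
    """Return the pip package names whose modules cannot be found."""
    return [
        package
        for package, module in import_names.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        # Suggest a single pip command covering everything that is absent.
        print("Missing packages: pip install " + " ".join(missing))
    else:
        print("All dependencies are importable.")
```

Running the script prints either a suggested `pip install` command for the missing packages or a confirmation that everything was found.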

Open-Ended Questions Regarding Implementation / ML Model

  • Would preserving capitalization help the model? (E.g., names are usually capitalized or in all caps, and titles are usually capitalized.)
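One way to explore the capitalization question is to lowercase each token for the model while keeping its casing pattern as separate features. The sketch below illustrates the idea under that assumption and does not describe the project's actual preprocessing. The function name `casing_features` and the feature names are hypothetical.

```python
def casing_features(token: str) -> dict:
    """Return the lowercased token plus flags describing its capitalization."""
    letters = [ch for ch in token if ch.isalpha()]
    # Treat a token as all-caps only if it has at least two letters,
    # so a single capital such as "A" counts as capitalized instead.
    is_all_caps = len(letters) > 1 and all(ch.isupper() for ch in letters)
    is_capitalized = token[:1].isupper() and not is_all_caps
    return {
        "lower": token.lower(),
        "is_capitalized": is_capitalized,
        "is_all_caps": is_all_caps,
    }


if __name__ == "__main__":
    # Hypothetical byline text, used only to show the feature output.
    for token in "By MICHAEL Wan and balaji".split():
        print(token, casing_features(token))
```

With features like these, the model can be trained with and without the two casing flags to compare whether the extra signal improves extraction of names and titles.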

Google Cloud Compute Engine (credentials needed)

SSH onto Instance: gcloud compute --project "autocitertraining" ssh --zone "us-west1-a" "overpowered-autociter"

SCP Files to Instance: gcloud compute scp --recurse * overpowered-autociter:~/[$PWD]

SCP Remote Instance Files to Local: gcloud compute scp --recurse overpowered-autociter:~/[$PWD]/assets/files/ml assets/files/ml

To-do

Project Guideline Doc

Tasklist Spreadsheet
