zalepa/patft

PATFT: A USPTO PATFT Parsing Library

PATFT is a simple gem that extracts relevant data from the raw HTML provided by the USPTO at http://patft.uspto.gov/. PATFT uses Nokogiri and XPath to scan the HTML it is given and returns a structured representation (e.g., Hash/JSON) of the patent document.

WARNING: PATFT is under active development. Refer to the roadmap below (and the specs) to see what is and is not implemented.

Usage

require 'patft'

# PATFT does not fetch pages; read HTML you have already saved locally.
local_html = File.read('patent.html')
patents = PATFT::Parser.new(local_html)

patents.extract(:title) # => 'System and method for ...'

Note that PATFT::Parser#parse requires a String representation of the HTML; how you obtain that String is up to you. This is intentional, given the USPTO's policy on scraping (and, more generally, to encourage responsible use).

Output Format

Below are the keys output by Parser#parse:

number

A String containing the patent number, without the kind code. Note that this field may contain non-numeric characters for design, reissue, and other non-utility patents.
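As an illustration of why the number is a String rather than an Integer, the sketch below normalizes a few common written forms into a number without the kind code. The helper `normalize_patent_number`, the regular expression, and the list of prefixes (D, RE, PP, H, T) are assumptions made for this example; they are not part of PATFT's documented API, and PATFT may treat these cases differently.

```ruby
# Hypothetical pattern: optional "US", optional type prefix, digits with
# optional thousands separators, optional trailing kind code (e.g. B2, S, E).
PATENT_NUMBER = /\A(?:US)?\s*((?:D|RE|PP|H|T)?)\s*([\d,]+)\s*([A-Z]\d?)?\z/i

# Hypothetical helper: returns the number without kind code, or nil if the
# input does not look like a patent number.
def normalize_patent_number(raw)
  match = PATENT_NUMBER.match(raw.strip)
  return nil unless match

  prefix = match[1].upcase
  digits = match[2].delete(',')
  return nil if digits.empty?

  "#{prefix}#{digits}"
end

['7,654,321', 'US 7,654,321 B2', 'D123,456', 'RE43,210', 'not a patent'].each do |raw|
  puts "#{raw.inspect} -> #{normalize_patent_number(raw).inspect}"
end
```

Keeping the prefix attached to the digits means design and reissue numbers stay distinguishable from utility patents that share the same digits.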

title

A String containing the title.

Roadmap

Short Term

Extract the following fields:

  • Number
  • Title
  • Issue Date
  • Abstract
  • Inventors*+
  • Assignee*
  • Family ID
  • Serial Number
  • Filing Date
  • US Class*+
  • CPC Class*+
  • Int'l Class*+
  • Field of search
  • Primary Examiner
  • Assistant Examiner
  • Attorney/Agent
  • Parent Case Text
  • Claims*+
  • Description (paragraphs)+
  • Related Patents*+
  • References Cited*+

Format notes:

  • An asterisk denotes structured data.
  • A plus denotes an array of data.
  • An asterisk and a plus together denote an array of structured data.
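One way to read the format notes, sketched as a Ruby Hash. Only `number` and `title` are documented output keys; every other key name, the nested field names, and all values here are made-up placeholders, since these fields are still on the roadmap and their final shape may differ.

```ruby
require 'json'

# Hypothetical output shape illustrating the three markers.
example = {
  number: 'D123456',           # plain String (documented key)
  title: 'Example widget',     # plain String (documented key)

  # Asterisk (*): structured data, sketched as a nested Hash.
  assignee: { name: 'Example Corp.', location: 'Springfield, IL' },

  # Plus (+): an array of data, sketched as an Array of Strings.
  description: ['First paragraph.', 'Second paragraph.'],

  # Asterisk and plus (*+): an array of structured data, i.e. an Array of Hashes.
  inventors: [
    { name: 'Doe; Jane', location: 'Springfield, IL' },
    { name: 'Roe; Richard', location: 'Shelbyville, IL' }
  ]
}

puts JSON.pretty_generate(example)
```

Because the whole structure consists of Hashes, Arrays, and Strings, it serializes to JSON without any custom conversion.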

Medium Term

  • CLI
  • Increase field support based on red book (e.g., PCT data)

Long Term (rough ideas)

  • Remote search interface
  • Query tool ("Advanced Search")
  • AppFT (probably a different gem)

License

The gem is available as open source under the terms of the MIT License.
