SEC Parser - A Lightweight SEC Financial Statement Data Sets Poller and Parser
A program that fetches the SEC Financial Statement Data Sets, parses them, and persists company fundamentals in a relational database so that they can be used for analysis.
Because the data sets can be quite large, the parser first persists the PRE, SUB and NUM files to a relational database for processing, which avoids exhausting memory and triggering heap errors. It then extracts the most important information for each company, such as REVENUES and LIABILITIES, and persists it in the FUNDAMENTALS table. The database schema is also provided with this source code.
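The memory concern above can be illustrated with a minimal sketch. It is not the parser's actual code: it assumes the data set files are tab-delimited text with a header line, and it uses a hypothetical sink callback where a real implementation would run a JDBC batch insert. The class name, method name, and sample rows are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchedDataSetReader {

    /**
     * Reads a tab-delimited data set line by line and hands the rows to the
     * sink in batches of at most batchSize, so that only one batch is held
     * in memory at a time. Returns the number of data rows read (the header
     * line is not counted).
     */
    static int readInBatches(BufferedReader reader, int batchSize,
                             Consumer<List<String[]>> sink) throws IOException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        String header = reader.readLine(); // first line holds the column names
        if (header == null) {
            return 0; // empty input
        }
        List<String[]> batch = new ArrayList<>(batchSize);
        int rows = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            batch.add(line.split("\t", -1)); // -1 keeps trailing empty fields
            rows++;
            if (batch.size() == batchSize) {
                sink.accept(batch); // e.g. a JDBC batch insert in a real poller
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(batch); // flush the final, possibly smaller batch
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample rows, not real filing data.
        String sample = "adsh\ttag\tvalue\n"
                + "0001-23\tRevenues\t100\n"
                + "0001-23\tLiabilities\t40\n"
                + "0002-45\tRevenues\t250\n";
        int rows = readInBatches(new BufferedReader(new StringReader(sample)), 2,
                batch -> System.out.println("flushing " + batch.size() + " rows"));
        System.out.println(rows + " rows read");
    }
}
```

Keeping the sink abstract means the sketch runs without a database; the batch size is the knob that trades memory use against the number of round trips.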
The SEC also provides documentation that describes the schema of the data sets.
The package also includes a Price Poller that can be used to fetch the closing price for each ticker symbol in the database once the financial statement data sets have been processed. Prices are fetched from nasdaq.com. The database schema is also provided with this source code.
How to configure
The Fillings Poller as well as the Price Poller have a separate
.properties file located under
/src/main/resources/price/connector.properties respectively. There you can specify database connection properties.
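As a rough illustration of how such a .properties file might be read, here is a minimal sketch using java.util.Properties. The key names db.url and db.user are assumptions made for the example and are not taken from the shipped connector.properties; check the actual file for the real keys.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

class ConnectorConfigSketch {

    /** Loads key=value pairs from any character source (file, classpath resource, string). */
    static Properties load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        return props;
    }

    /** Returns the trimmed value for key, or fails fast if it is missing or blank. */
    static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing required property: " + key);
        }
        return value.trim();
    }
}
```

With a file containing a line such as `db.url = jdbc:example://localhost/sec` (a hypothetical key and URL), `require(props, "db.url")` would return the URL, while a missing key fails at startup instead of at the first database call.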
For the Fillings Poller you also have to specify an
ErrorDir. These directories serve as an intermediate bucket for the data set files being downloaded. That is, the poller fetches a data set file and places it temporarily under
InputDir. Once it is done processing the data set file it moves it to
ErrorDir in case an error has occurred while processing.
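The error path described above can be sketched as follows. This is an illustration under assumptions, not the poller's actual code: FileProcessor is a hypothetical stand-in for the real parsing step, and the sketch only covers the move to ErrorDir on failure. What happens to a successfully processed file is left to the caller.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class ErrorDirSketch {

    /** Hypothetical stand-in for the real parsing step. */
    interface FileProcessor {
        void process(Path file) throws Exception;
    }

    /**
     * Runs the processor on a file sitting in the input directory.
     * Returns true on success. On failure the file is moved to errorDir
     * (created if necessary) and false is returned.
     */
    static boolean processOrMoveToErrorDir(Path file, Path errorDir, FileProcessor processor)
            throws IOException {
        try {
            processor.process(file);
            return true;
        } catch (Exception e) {
            Files.createDirectories(errorDir);
            // Keep the original file name so the failed data set is easy to find.
            Files.move(file, errorDir.resolve(file.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
            return false;
        }
    }
}
```

Moving failed files aside keeps InputDir free of files that would fail again on the next run, while preserving them for inspection.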
How to run
$ mvn install
$ java -jar sec-connector-1.0-fillings.jar
The Price Poller can be executed as follows:
$ java -jar sec-connector-1.0-price.jar