The Public Library of Science publishes peer-reviewed scientific research under a Creative Commons license. Under this license, all research articles are freely available to the scientific community and the general public. The obvious advantage of freely accessible scientific literature is faster dissemination of information. If quality is maintained through peer review, this is a win-win for science. There are also less obvious advantages that are virtually impossible to realize in a closed-access world.
Scientific articles are the quanta of an evolving scientific conversation. By placing these quanta in silos (closed journals), we limit our ability to discover and ultimately participate in that conversation. The problem is compounded because our primary tool for information discovery (the computer) has limited access to these silos, leaving the hapless researcher at the mercy of the journal provider.
Several fields of research (e.g. Natural Language Processing and Machine Learning) could benefit from having articles available as data. Keeping articles in silos inhibits the development of software tools that help scientists and the public discover information in a timely fashion.
This project is an attempt to build a body of code that addresses these issues. To get started, see the README.