Search http://www.copyrightevidence.org and more.
Index the content of the copyright wiki into Elasticsearch, then build a simple API and search frontend on top of the index.
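A minimal sketch of the indexing step, assuming each wiki page has already been reduced to a dict with a title and a text field. The index name `copyrightwiki`, the field names and the function `bulk_body` are made up for illustration and are not part of this project. The sketch only builds the newline-delimited JSON body that the Elasticsearch `_bulk` endpoint accepts; it does not talk to a server.

```python
import json


def bulk_body(pages, index="copyrightwiki"):
    """Build an NDJSON body for the Elasticsearch _bulk endpoint.

    pages: iterable of dicts, each with at least a "title" key
           (hypothetical document shape).
    index: target index name (hypothetical).
    """
    lines = []
    for page in pages:
        # Action line: index this document, using the page title as its id.
        lines.append(json.dumps({"index": {"_index": index, "_id": page["title"]}}))
        # Source line: the document itself.
        lines.append(json.dumps(page))
    if not lines:
        return ""
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = [{"title": "Example study (2010)", "text": "Abstract goes here."}]
    print(bulk_body(example), end="")
```

The resulting string could be sent as the body of a POST request to `/_bulk` with the header `Content-Type: application/x-ndjson`. Using the title as the document id means re-running the import overwrites pages instead of duplicating them.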
More data sources:
- list of PDF uploads in the wiki
- links from the externallinks table in the database dump
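Pages with studies (HTML files whose path contains a number in parentheses, skipping action, Special: and User: pages and title= URLs):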
$ find . -name "*html" | grep -v "action" | grep -v "Special:" | grep -v "User:" | grep -E '\([0-9]*\)' | grep -v "title="
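The same filter, sketched in Python in case the page list is needed inside an import script. The function names `is_study_page` and `find_study_pages` are made up for illustration. Unlike `find`, this version only returns regular files and yields paths without the leading `./`.

```python
import re
from pathlib import Path

# Substrings that mark non-content pages (same as the grep -v filters).
EXCLUDED = ("action", "Special:", "User:", "title=")

# Same pattern as grep -E '\([0-9]*\)': parentheses around zero or more digits.
NUMBER_IN_PARENS = re.compile(r"\([0-9]*\)")


def is_study_page(path):
    """Return True if the path passes the same checks as the shell pipeline."""
    name = path.rsplit("/", 1)[-1]
    # find -name "*html" matches on the file name only.
    if not name.endswith("html"):
        return False
    # The grep filters run against the whole path.
    if any(token in path for token in EXCLUDED):
        return False
    return NUMBER_IN_PARENS.search(path) is not None


def find_study_pages(root="."):
    """Walk root recursively and return sorted paths of study pages."""
    return sorted(
        str(p)
        for p in Path(root).rglob("*html")
        if p.is_file() and is_study_page(str(p))
    )


if __name__ == "__main__":
    for page in find_study_pages("."):
        print(page)
```

Note that the pattern also accepts empty parentheses, exactly like the shell version; tightening it to `[0-9]+` would require at least one digit.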
Access API.
Oh my. http://stackoverflow.com/a/1625291/89391
I don't think the API can return just the plain text of a page.
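One possible workaround, sketched here under the assumption that the rendered HTML of a page is available (for example from the local HTML files): strip the markup with the standard library parser and keep only the text nodes. The names `TextExtractor` and `html_to_text` are made up for illustration; this is not how the project necessarily does it.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring the contents of script and style tags."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a skipped element.
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join text nodes with spaces, then collapse runs of whitespace.
    return " ".join(" ".join(parser.parts).split())


if __name__ == "__main__":
    print(html_to_text("<p>Hello <b>world</b></p>"))
```

Joining text nodes with a space keeps words from adjacent elements apart, at the cost of an occasional extra space before punctuation that directly follows an inline tag. Navigation and sidebar text is kept as well, so a real import would probably want to restrict extraction to the main content element.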
Basic data from wiki. Links to external PDFs.
Formalize related pages in wiki.