Commit
Merge pull request #33 from sigmavirus24/master
Just a minor pet peeve, fix readme.rst
Showing 1 changed file with 31 additions and 19 deletions.
@@ -1,61 +1,73 @@

Project Gutenberg Stats
=======================

Estimated 1.6 million files
Reported 650 GB total
~40,000 + books

`link to issues`_

.. _link to issues: https://github.com/sethwoodworth/GITenberg/issues

How are we getting the files?
=============================

::

    rsync -rvhz --progress --partial ftp...

Each repo should...
===================

+ metadata.yml

  + author
  + title
  + publishing info
  + provenance

+ book_name.{rst|tei|txt}

  + book text in a master source format

+ license.txt

  + PG license information
  + transcriber, converter credits

+ README.rst

  + generic GITenberg info
  + generic PG info
  + book specific info
  + desc and links to toolchains
  + desc and links to generated versions for ebook readers

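The layout listed above could be checked mechanically. The following is only a sketch under assumptions: the function name ``missing_files`` and the idea of validating a repo directory are hypothetical and not part of the project; only the file names come from the list.

```python
from pathlib import Path

# Master-format extensions named in the list above.
MASTER_EXTENSIONS = ("rst", "tei", "txt")
# Fixed file names named in the list above.
REQUIRED_FILES = ("metadata.yml", "license.txt", "README.rst")


def missing_files(repo_dir, book_name):
    """Return the expected entries that are absent from repo_dir.

    Hypothetical helper: the README describes the layout but does not
    prescribe any validation step.
    """
    repo = Path(repo_dir)
    missing = [name for name in REQUIRED_FILES if not (repo / name).is_file()]
    # The book text may be stored in any one of the master formats.
    has_book = any(
        (repo / (book_name + "." + ext)).is_file() for ext in MASTER_EXTENSIONS
    )
    if not has_book:
        missing.append(book_name + ".{" + "|".join(MASTER_EXTENSIONS) + "}")
    return missing
```

An empty list means every expected file is present. The check looks only at file names, not at what ``metadata.yml`` or ``license.txt`` contain.
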
Smart comments:
===============

Convert all files to UTF-8
https://groups.google.com/forum/?fromgroups#!topic/prj-alexandria/VhKbMyH9kcA

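One way the UTF-8 conversion might look for a single file is sketched below. The source does not say which encodings the existing files use, so the ``source_encoding`` default of ``latin-1`` is purely an assumption, and the function name ``convert_to_utf8`` is hypothetical.

```python
from pathlib import Path


def convert_to_utf8(path, source_encoding="latin-1"):
    """Re-encode one text file to UTF-8 in place.

    Returns True if the file was rewritten, False if it was already
    valid UTF-8. source_encoding is an assumption and has to be chosen
    per file; a wrong choice produces wrong characters without any error.
    """
    path = Path(path)
    raw = path.read_bytes()
    try:
        # Already valid UTF-8 (this includes plain ASCII): leave untouched.
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        text = raw.decode(source_encoding)
        # Work on bytes so that line endings are preserved as they are.
        path.write_bytes(text.encode("utf-8"))
        return True
```

Since the file is rewritten in place, it is safer to run this on a copy or inside a git working tree, where the change can be reviewed and reverted.
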
File formats:
=============

A list of file formats and their frequency is in the docs folder, generated via:

::

    find -type f|rev|cut -d\. -f1|grep -v "/" |rev|sort -f|uniq -c|sort -nr

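For readers who find the pipeline above hard to follow, here is a rough Python counterpart. It is a sketch, not an exact equivalent: the function name ``extension_counts`` is hypothetical, and unlike ``find -type f`` it also counts symbolic links that point to files.

```python
import os
from collections import Counter


def extension_counts(root="."):
    """Count files under root by the text after the last dot in their name."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Names without a dot have no extension and are skipped,
            # much as the grep -v "/" step drops them in the pipeline.
            if "." in name:
                counts[name.rsplit(".", 1)[1]] += 1
    return counts
```

Calling ``extension_counts(".").most_common()`` gives the extensions ordered from most to least frequent, which corresponds to the final ``sort -nr`` step. As with ``uniq -c``, counting is case-sensitive, so ``txt`` and ``TXT`` are tallied separately.
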
.tei
~~~~

a master format
http://www.tei-c.org/Tools/Stylesheets/
http://code.google.com/p/hrit/source/browse/rst2xml-tei.py?repo=tei-rest

.rst
~~~~

a master format
Research toolchain for rst >> whatever

dp rst manual http://pgrst.pglaf.org/publish/181/181-h.html

Future
------

+ http://armypubs.army.mil/doctrine/