Usage

This is a scraping script that extracts the results of the Romanian Baccalaureate from http://bacalaureat.edu.ro. It works for the years 2010 and 2011. Input can come from standard input or from files, and the files may be compressed with gzip, bzip2 or xz.

I wrote it to help a friend who wanted to analyze the results of the exam. You can read her analysis at the following addresses:

Usage

./main.py --help
usage: main.py [-h] [-o OUTPUT] [--format FORMAT] [--dbtable DBTABLE]
               FILE [FILE ...]

Extract information about students from HTML files

positional arguments:
  FILE                  A page from the edu.ro website. Use - for stdin.

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        The output file. Defaults to stdout.
  --format FORMAT       The output format. Supported formats: python, pickle,
                        sqlite. Default format: python.
  --dbtable DBTABLE     The name of the database table. Default name:
                        rezultate.
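
For readers who want to see how such an interface might be declared, here is a minimal `argparse` sketch that accepts the same options as the help text above. It is a reconstruction from the help output only, not the code in main.py: the function name `build_parser` and the destination name `files` are hypothetical, and the help strings are given in English.

```python
import argparse


def build_parser():
    """Hypothetical reconstruction of the command-line interface shown above."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Extract information about students from HTML files",
    )
    # One or more input pages; "-" is the conventional placeholder for stdin.
    parser.add_argument("files", metavar="FILE", nargs="+",
                        help="A page from the edu.ro website. Use - for stdin.")
    parser.add_argument("-o", "--output", default=None,
                        help="The output file. Defaults to stdout.")
    parser.add_argument("--format", default="python",
                        help="The output format: python, pickle or sqlite.")
    parser.add_argument("--dbtable", default="rezultate",
                        help="The name of the database table.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--format", "sqlite", "-o", "bac.sqlite", "-"])
print(args.format, args.output, args.dbtable, args.files)
```

Leaving `--output` as `None` by default lets the caller fall back to stdout, which matches the documented default.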

Usage examples

./main.py data/page_21.html
./main.py data/page_21.html{,.gz,.bz2,.xz}
./main.py data/page_18.html data/page_18.html
tar xJfO bac2010_alfabetic_page.tar.xz | ./main.py --format sqlite --output bac.sqlite --dbtable results2010 -
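
After a run with `--format sqlite`, a quick sanity check is to count the rows that ended up in the table. The sketch below uses only Python's standard `sqlite3` module. The helper name `count_rows` is hypothetical, and because the column layout of the table is not documented here, it deliberately avoids naming any columns.

```python
import sqlite3


def count_rows(db_path, table):
    """Return the number of rows in `table` of the SQLite file at `db_path`."""
    # Table names cannot be bound as SQL parameters, so validate before interpolating.
    if not table.isidentifier():
        raise ValueError("unexpected table name: %r" % table)
    conn = sqlite3.connect(db_path)
    try:
        (count,) = conn.execute('SELECT COUNT(*) FROM "%s"' % table).fetchone()
        return count
    finally:
        conn.close()
```

For the last example above, the call would be `count_rows("bac.sqlite", "results2010")`.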

Installation and Requirements

python 2.7
python-lxml
pyliblzma
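
pyliblzma is what provides xz support on Python 2.7. As an illustration of how the three compression formats can be read transparently, here is a minimal sketch that assumes Python 3, where `gzip`, `bz2` and `lzma` are all in the standard library. The helper name `read_page` is hypothetical, and picking the decompressor by file extension is an assumption; main.py may detect the format differently.

```python
import bz2
import gzip
import lzma
import sys

# Map file extensions to the standard-library opener that handles them.
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def read_page(path):
    """Return the raw bytes of `path`, decompressing by extension; '-' means stdin."""
    if path == "-":
        return sys.stdin.buffer.read()
    for extension, opener in _OPENERS.items():
        if path.endswith(extension):
            with opener(path, "rb") as handle:
                return handle.read()
    # No known compression suffix: treat the file as plain HTML.
    with open(path, "rb") as handle:
        return handle.read()
```

Returning bytes rather than text leaves character decoding to the HTML parser, which can honor the encoding declared in the page itself.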

Fedora 15

yum install python-lxml pyliblzma

Copyright and License

The code is too simple and too ugly to require legal paperwork, so I declare it public domain. Though if you find this useful in any way, I would like you to tell me about it or give me some credit. Thank you!

Credits

This wouldn't have been possible without the Sothink SWF Decompiler. Shame on Siveco for using Flash even if it wasn't really needed.

