URIs, indexes, and RSS feeds of the 72 largest closed-source publishers, and the code to scrape them.

OpSci_scrapers

  _____________________                         ______________                      ______
 /\                    \                       /\             \                    /\     \
/  \      _________     \                     /  \      _______\                  /  \_____\
\   \     \       /\     \  __________________\__ \     \______/_____  ___________\__/ ____/_
 \   \     \_____/  \     \/\       ________     \ \                 \/\             \/\     \
  \   \     \    \   \     \ \      \      /\     \ \____________     \ \     ________\ \     \
   \   \     \____\___\     \ \      \____/  \     \/ ___________\     \ \    \_______/_ \     \
    \   \                    \ \      \___\___\     \/\                 \ \             \ \     \
     \   \____________________\ \       _____________\ \_________________\ \_____________\ \_____\
      \  /                    /  \      \            / /                 / /             / /     /
       \/____________________/\   \      \__________/\/_________________/\/_____________/\/_____/
                               \   \      \
                                \   \      \
                                 \   \______\
                                  \  /      /
                                   \/______/

Overview

URIs, indexes, and RSS feeds of the 72 largest closed-source publishers, and the code to scrape them.

Proof

https://docs.google.com/spreadsheet/ccc?key=0AtOqyz8P_fJ0dHNKUmh4UGxsa1hVdXBKVmd3Zy0yc3c

Usage

  • $ cd OpSci/
  • $ bundle install
  • $ cd lib/scrapers
  • $ ruby elsevier.rb
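
The steps above run a single scraper. To run several in one go, something like the following sketch may help. It assumes that lib/scrapers holds one standalone script per publisher (only elsevier.rb is confirmed above) and that each script exits non-zero on failure; the helper names `scraper_scripts` and `run_scrapers` are hypothetical and not part of this repository.

```ruby
require "rbconfig"

# Hypothetical helper: list the Ruby scripts in a directory, sorted by name.
def scraper_scripts(dir)
  Dir.glob(File.join(dir, "*.rb")).sort
end

# Hypothetical helper: run every script in its own Ruby process, with the
# working directory set to `dir` (mirroring `cd lib/scrapers` above).
# Returns a hash of { "script.rb" => true/false }.
def run_scrapers(dir)
  scraper_scripts(dir).each_with_object({}) do |path, results|
    name = File.basename(path)
    # system returns true on exit status 0, false otherwise, nil if it could not start.
    results[name] = system(RbConfig.ruby, name, chdir: dir) == true
  end
end

if __FILE__ == $PROGRAM_NAME
  # Assumed default location, taken from the Usage steps.
  run_scrapers(ARGV.fetch(0, "lib/scrapers")).each do |name, ok|
    puts "#{ok ? 'ok    ' : 'FAILED'} #{name}"
  end
end
```

Saved under a name of your choosing in the repository root, it would be run with `ruby` after `bundle install`, optionally passing a different directory as the first argument. Running each scraper in a separate process keeps one failing publisher from stopping the rest.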

Credit

Appreciation given to the minds in ##hplusroadmap. W/o you guys, I'd have no friends =]

Publishers

  • american association for the advancement of science
  • american chemical society
  • royal chemical society
  • american geophysical union
  • american institute of physics
  • american psychological association
  • annual reviews
  • association for computing machinery
  • association for symbolic logic
  • begell house
  • bentham science
  • berghahn books
  • biomed central
  • bmj group
  • brill publishers
  • british ecological society
  • cambridge university press
  • cell press
  • cold spring harbor laboratory press
  • csiro publishing
  • edinburgh university press
  • edp sciences
  • elsevier
  • european mathematical society publishing house
  • harvard university press
  • hindawi publishing
  • ieee
  • indiana university press
  • informa healthcare
  • informs
  • ingenta connect
  • international press
  • iop publishing
  • ios press
  • japan society of applied physics
  • john benjamins
  • johns hopkins university press
  • karger
  • landes bioscience
  • lippincott williams & wilkins
  • maney publishing
  • mary ann liebert
  • mathematical sciences publishers
  • medknow publications
  • mit press
  • multidisciplinary digital publishing institute
  • national research university higher school of economics
  • nature publishing group
  • nauka
  • nrc research press
  • optical society of america
  • oxford university press
  • penn state university press
  • philosophy documentation center
  • polish academy of sciences
  • royal society of chemistry
  • sage
  • siam
  • Société Mathématique de France
  • springer
  • taylor & francis
  • thieme
  • Universitetsforlaget
  • university of california press
  • university of chicago press
  • university of illinois press
  • walter de gruyter
  • wiley online library
  • wolters kluwer
  • world scientific