Skip to content

acmeism/RosettaCodeData

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

RosettaCode Data Project

This git repository contains (almost) all of the code samples available on http://rosettacode.org organized by Language and Task.

Getting the Data

All of the data is in this repository, so you can just run:

git clone https://github.com/acmeism/RosettaCodeData

However...

It's a lot of data!

If you just want the latest data, the quickest thing to do is:

git clone https://github.com/acmeism/RosettaCodeData --single-branch --depth=1

Tools

This repository's data content is created by a Perl program called rosettacode.

You can install it with this command:

cpanm RosettaCode

You can rebuild the data with:

make build

This repository has a bin directory with various tools for working with the data.

  • rcd-api-list-all-langs

    List all the programming language names directly from rosettacode.org

  • rcd-api-list-all-tasks

    List all the programming task names directly from rosettacode.org

  • rcd-new-langs

    List the RosettaCode languages not yet add to Conf

  • rcd-new-tasks

    List the RosettaCode tasks not yet add to Conf

  • rcd-samples-per-lang

    Show the number of code samples per language

  • rcd-samples-per-task

    Show the number of code samples per task

  • rcd-tasks-per-lang

    Show the number of tasks with code samples per language

  • rcd-langs-per-task

    Show the number of languages with code samples per task

To Do

Pull requests welcome!

This project is not a perfect representation of RosettaCode yet. It has a few uncicode issues. It also has to deal with various formatting mistakes in the mediawiki source pages.

  • Fix bugs

  • Correct the 100s of guessed file extensions in Conf/lang.yaml

  • Ability to only fetch cache pages since last pushed data update

  • Support names with non-ascii characters

  • Add more bin tools

  • Address errors reported in rosettacode.log after running make build