updates to readme

1 parent 402eecd commit 93000ab12f014030963c09ec5280d147b095fb81 @emmett9001 emmett9001 committed Jan 28, 2013
Showing with 24 additions and 50 deletions.
@@ -4,41 +4,43 @@ schemato
Meet Mr. Schemato, the friendly semantic web validator and distiller that is
making metadata cool again.
+To contribute or report bugs, contact You can also report
+Github issues if you prefer.
-This is a validator for the a number of embedded metadata standards. It
-works by reading the object ontology and comparing each of a set of
-parsed tuples from a document against this ontology.
-To test the validation, clone this repo, pip install the requirements, and run
- >>> from schemato import Validator
- >>> validator = Validator()
- >>> validator.validate("test_documents/rdf.html")
-this will run a validation on a correctly-implemented RDFa document (rdf.html). To run
-a validation on a document with errors, use one of the error test files
+This is a validator for HTML-embedded metadata standards. It knows the
+location of the official schema definitions, and uses these documents as
+validation templates. As a contributor, you can easily subclass the base
+validator class to plug into this functionality.
-``>>> Schemato("test_documents/schema_errors.html").validate()``
+To see the validator in action:
-The full standard is now also supported. You can validate any page
-that uses this standard against the RDFa ontology hosted at To
-test this, you can find an arbitrary article, or copy and paste
-this example
+ $ git clone
+ $ cd schemato
+ $ pip install -r requirements.txt
+ $ cd schemato
+ $ ipython
+ >>> from schemato import Schemato
+ >>> sc = Schemato("../test_documents/rdf.html", loglevel="INFO")
+ >>> sc.validate()
-``>>> Schemato("").validate()``
+The first time you run schemato, it will make requests for the latest versions
+of the official schema definitions. These files are cached locally with
+a fairly long expiry, to avoid the overhead of web requests. Schemato will
+then call the ``validate()`` method of the Validator subclasses listed in
-The ``test_documents`` directory also includes four documents for testing the validation in RDFa
-and microdata, both with and without errors built in. Running the validator on
-either of the correct files should yield no errors.
+There are a few other test documents available for validation in the
+test\_documents subdirectory.
Schemato's distiller framework lets you implement strategies for creating a "normalized" set of metadata by mixing and matching metadata from different supported standards.
-Supported so far:
+Supported so far:
* parsely-page
* OpenGraph
@@ -99,31 +101,3 @@
In this case, our strategy did not involve parsely-page, and instead used and OpenGraph. Since Mashable does not implement but does
implement OpenGraph, it comes up with the fields it can. The ``sources`` property
shows which fields were populated and how they got their values.
-Hosted Service
-The schemato module is also incorporated into a web service that provides
-a nice frontend for the validation. To test this service locally, run
-``python``. Then navigate to localhost:5000, paste
-a url into the search bar, and click "Validate" to run a validation on the document.
-Running this service locally also requires celery and rabbitmq to be running
-and properly configured. RabbitMQ and celery can be configured to work
-together using the supplied example.schemato\ file. Change its name
-to schemato\ and replace the dummy username, password, and vhost
-fields to the appropriate RabbitMQ settings for your system.
-Simply use ``pip install -r requirements.txt`` to install the dependencies for
-this project. It also requires a local RabbitMQ server, which can be
-downloaded at
-schemato was designed and implemented by Emmett Butler,, Inc.
-parts were contributed by Andrew Montalenti,
