
updates to readme

1 parent 402eecd, commit 93000ab12f014030963c09ec5280d147b095fb81, emmett9001 committed Jan 28, 2013
Showing with 24 additions and 50 deletions.
  1. +24 −50 README.md
@@ -4,41 +4,43 @@ schemato
Meet Mr. Schemato, the friendly semantic web validator and distiller that is
making metadata cool again.
+To contribute or report bugs, contact emmett@parsely.com. You can also report
+Github issues if you prefer.
+
Validator
---------
-This is a validator for the a number of embedded metadata standards. It
-works by reading the object ontology and comparing each of a set of
-parsed tuples from a document against this ontology.
-
-To test the validation, clone this repo, pip install the requirements, and run
-
- >>> from schemato import Validator
- >>> validator = Validator()
- >>> validator.validate("test_documents/rdf.html")
-
-this will run a validation on a correctly-implemented RDFa document (rdf.html). To run
-a validation on a document with errors, use one of the error test files
+This is a validator for HTML-embedded metadata standards. It knows the
+location of the official schema definitions, and uses these documents as
+validation templates. As a contributor, you can easily subclass the base
+validator class to plug into this functionality.
-``>>> Schemato("test_documents/schema_errors.html").validate()``
+To see the validator in action:
-The full schema.org standard is now also supported. You can validate any page
-that uses this standard against the RDFa ontology hosted at schema.org. To
-test this, you can find an arbitrary nytimes.com article, or copy and paste
-this example
+ $ git clone https://github.com/Parsely/schemato.git
+ $ cd schemato
+ $ pip install -r requirements.txt
+ $ cd schemato
+ $ ipython
+ >>> from schemato import Schemato
+ >>> sc = Schemato("../test_documents/rdf.html", loglevel="INFO")
+ >>> sc.validate()
-``>>> Schemato("http://www.nytimes.com/2012/07/19/world/middleeast/.....html").validate()``
+The first time you run schemato, it will make requests for the latest versions
+of the official schema definitions. These files are cached locally with
+a fairly long expiry, to avoid the overhead of web requests. Schemato will
+then call the ``validate()`` method of the Validator subclasses listed in
+settings.py.
-The ``test_documents`` directory also includes four documents for testing the validation in RDFa
-and microdata, both with and without errors built in. Running the validator on
-either of the correct files should yield no errors.
+There are a few other test documents available for validation in the
+test\_documents subdirectory.
Distiller
---------
Schemato's distiller framework lets you implement strategies for creating a "normalized" set of metadata by mixing and matching metadata from different supported standards.
-Supported so far:
+Supported so far:
* parsely-page
* OpenGraph
@@ -99,31 +101,3 @@ In this case, our strategy did not involve parsely-page, and instead used
Schema.org and OpenGraph. Since Mashable does not implement Schema.org but does
implement OpenGraph, it comes up with the fields it can. The ``sources`` property
shows which fields were populated and how they got their values.
-
-Hosted Service
---------------
-
-The schemato module is also incorporated into a web service that provides
-a nice frontend for the validation. To test this service locally, run
-``python schemato_web.py``. Then navigate to localhost:5000, paste
-a url into the search bar, and click "Validate" to run a validation on the document.
-
-Running this service locally also requires celery and rabbitmq to be running
-and properly configured. RabbitMQ and celery can be configured to work
-together using the supplied example.schemato\_config.py file. Change its name
-to schemato\_config.py and replace the dummy username, password, and vhost
-fields to the appropriate RabbitMQ settings for your system.
-
-Requirements
-------------
-
-Simply use ``pip install -r requirements.txt`` to install the dependencies for
-this project. It also requires a local RabbitMQ server, which can be
-downloaded at http://www.rabbitmq.com/download.html
-
-Authors
--------
-
-schemato was designed and implemented by Emmett Butler, Parse.ly, Inc.
-
-parts were contributed by Andrew Montalenti, Parse.ly
