Commit

More readme info
ggaughan committed Mar 9, 2010
1 parent 1141fa0 commit 52df0b9
Showing 1 changed file with 27 additions and 3 deletions.
30 changes: 27 additions & 3 deletions README
@@ -1,3 +1,27 @@
This is a proof-of-concept and so far can handle very basic
fetch/filter/output pipelines.

Design
------
The Yahoo pipelines are translated into pipelines of Python generators which
should give a close match to the original data flow. Each call to the final
generator will ripple through the pipeline issuing .next() calls until the
source is exhausted.
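
As a rough illustration of the generator chaining described above, here is a
minimal sketch. The stage names (fetch, filter_items, output), the FEED data
and the pulled list are all hypothetical and are not taken from the pipe2py
code; the sketch uses the next() builtin, which is the Python 3 spelling of
the .next() method call.

```python
# Hypothetical three-stage pipeline: fetch -> filter -> output.
pulled = []  # records when the source is actually asked for an item


def fetch(items):
    """Source stage: stands in for a feed fetch, yields one item at a time."""
    for item in items:
        pulled.append(item["title"])
        yield item


def filter_items(source, keyword):
    """Middle stage: passes on only the items whose title contains keyword."""
    for item in source:
        if keyword in item["title"]:
            yield item


def output(source):
    """Final stage: hands each item to the caller unchanged."""
    for item in source:
        yield item


FEED = [
    {"title": "Python generators"},
    {"title": "Cooking"},
    {"title": "Python pipes"},
]

# Building the pipeline runs no stage code yet; generators are lazy.
pipeline = output(filter_items(fetch(FEED), "Python"))

first = next(pipeline)            # one request ripples back to the source
pulled_after_first = list(pulled)
rest = list(pipeline)             # keep pulling until the source is exhausted
```

Only one item has been fetched after the first next() call, which is the
"ripple" behaviour: nothing upstream runs until the final generator is asked
for a value.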

The modules are topologically sorted to give their creation order.
A module's main output is the values it yields, and its main input is passed
as its first parameter. Other inputs are passed as named parameters
referencing the input module.
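
The ordering and wiring can be sketched as follows. This is only an
illustration under assumed names: the modules, main_wires and named_wires
structures, the module functions and the topological_order helper are
hypothetical and do not reflect the actual pipe2py data layout. The sort
assumes the wiring contains no cycles.

```python
def topological_order(deps):
    """Return module ids so each module follows every module feeding it.

    deps maps a module id to the list of module ids it reads from.
    Assumes the graph is acyclic (no cycle detection is done).
    """
    order, seen = [], set()

    def visit(module_id):
        if module_id in seen:
            return
        seen.add(module_id)
        for upstream in deps.get(module_id, []):
            visit(upstream)
        order.append(module_id)  # appended only after all of its inputs

    for module_id in sorted(deps):
        visit(module_id)
    return order


# Hypothetical module implementations: the first parameter is the main input.
def fetch(_source, conf):
    for title in conf["titles"]:
        yield {"title": title}


def text_input(_source, conf):
    yield conf["default"]


def keep(source, conf, word):
    wanted = next(word)  # "word" is a named input fed by another module
    for item in source:
        if wanted in item["title"]:
            yield item


modules = {
    "src": (fetch, {"titles": ["Python pipes", "Gardening"]}),
    "term": (text_input, {"default": "Python"}),
    "flt": (keep, {}),
}
main_wires = {"flt": "src"}              # main output -> first parameter
named_wires = {"flt": {"word": "term"}}  # other inputs -> named parameters

deps = {module_id: [] for module_id in modules}
for module_id, upstream in main_wires.items():
    deps[module_id].append(upstream)
for module_id, inputs in named_wires.items():
    deps[module_id].extend(inputs.values())

order = topological_order(deps)

built = {}
for module_id in order:
    func, conf = modules[module_id]
    source = built.get(main_wires.get(module_id))  # None for source modules
    extra = {name: built[up]
             for name, up in named_wires.get(module_id, {}).items()}
    built[module_id] = func(source, conf, **extra)

result = list(built["flt"])
```

Creating modules in sorted order means every generator a module needs already
exists by the time that module is built.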

The JSON representation of the configuration parameters maps closely onto
Python dictionaries, so it is left as-is, passed along to the modules, and
parsed only when needed.
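
A small sketch of that idea, using the standard json module. The JSON layout
below (the "modules", "id", "type", "conf" and "URL" keys) is an invented
fragment for illustration only and should not be read as the real pipe
definition format; fetch_url is likewise a hypothetical helper.

```python
import json

# Invented pipe fragment; the real definition format may differ.
pipe_def = """
{
  "modules": [
    {"id": "sw-1",
     "type": "fetch",
     "conf": {"URL": {"value": "http://example.com/rss"}}}
  ]
}
"""

# The whole definition becomes nested dicts and lists in one call.
pipe = json.loads(pipe_def)


def fetch_url(conf):
    """Dig the URL out of the conf dict only at the point it is needed."""
    return conf["URL"]["value"]


module = pipe["modules"][0]
url = fetch_url(module["conf"])  # the conf dict is passed through untouched
```

Because the conf dictionary is handed over unchanged, each module decides for
itself which keys it cares about.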

Each Yahoo module is coded as a separate Python module. This might help in
future if the generators are made to run on separate processors/machines:
we could then use queues to plumb them together.
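
One possible shape for such queue plumbing is sketched below. It is an
assumption about a future design, not something the code is stated to do:
threads within one process stand in for separate processors/machines, and
pump, drain, numbers and evens are hypothetical names.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def pump(source, out_queue):
    """Run in the producer: push every item of a generator onto a queue."""
    for item in source:
        out_queue.put(item)
    out_queue.put(_DONE)


def drain(in_queue):
    """Run in the consumer: expose a queue as an ordinary generator again."""
    while True:
        item = in_queue.get()
        if item is _DONE:
            return
        yield item


def numbers():
    for n in range(5):
        yield n


def evens(source):
    for n in source:
        if n % 2 == 0:
            yield n


link = queue.Queue(maxsize=2)  # bounded, so the producer cannot run far ahead
producer = threading.Thread(target=pump, args=(numbers(), link))
producer.start()

# Downstream stages are unchanged: they still just iterate over a generator.
received = list(evens(drain(link)))
producer.join()
```

Across real process or machine boundaries the identity-based sentinel would
not survive serialisation, so an explicit end-of-stream message would be
needed instead.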


Install the dependencies
------------------------
Universal feedparser (http://www.feedparser.org/):
@@ -26,8 +50,8 @@ script which wraps the pipeline in a function and can then be imported and
run from another Python program. The other interprets the pipeline on-the-fly
and executes it within the current process.

-1. Compiling a pipeline to Python
----------------------------------
+1. Compiling a pipeline to a Python script
+------------------------------------------
* python compile.py -p pipelineid

or
@@ -36,7 +60,7 @@ and executes it within the current process.

2. Interpreting a pipeline and executing in-process
---------------------------------------------------
-from pipe2py import compile
+from pipe2py.compile import parse_and_build_pipe

pipe_def = """json representation of the pipe"""
