This repository has been archived by the owner on Jul 2, 2020. It is now read-only.

Commit

removed TODO note about "re-run" of Datanest harvester being able to get only the updates

Note: since 68fb71f, updates are roughly working. The main limitation is
Datanest's API itself (no Last-Modified or anything similar). The main
improvements will come from the addition of Jackrabbit.

hanecak committed May 10, 2012
1 parent ea20af9 commit b6ac02b
Showing 1 changed file with 3 additions and 7 deletions.
10 changes: 3 additions & 7 deletions TODO
@@ -21,15 +21,11 @@ c) find out the best value for 'datanest.organizations.batch_size', "best"
 takes around 6-8 hourse to add all items which slows down development and
 testing
 
-d) enhancing the Datanest CSV harvesters to be able to "re-run"
-and process only differences - i.e. we need to make them able to
-keep ODN copy of data up-to-date with Datanest "original"
-
-e) adding more harvesters: direct harvesting of ORSR to get more data
+d) adding more harvesters: direct harvesting of ORSR to get more data
 about companies, direct harvesting of procurement portals to have
 a shot by getting more data from the scanned documents themselves, ...
 
-f) transform the existing application into something like container
+e) transform the existing application into something like container
 or whatever (maybe even using OSGI) so that it loads and runs
 Harvester and other components (as sort of plug-ins)
 
@@ -41,4 +37,4 @@ f) transform the existing application into something like container
 not need certain harvesters and APIs so there would be a way to
 configure and deploy ODN to meet that criteria)
 
-g) ...
+f) ...
