
Define a use case. #23

Closed
pudo opened this issue Oct 26, 2012 · 5 comments


pudo commented Oct 26, 2012

Is this for ckanext-datastorer?

@rufuspollock (Contributor)

DUPLICATE of #24 ;-)


pudo commented Oct 26, 2012

Disagree. It's easy to make a long list of specific things that would be enabled by this, but it really needs a fairly simple story to begin with: a narrative, like "this will import 20 new file types into the CKAN DataStore".

Otherwise it's just technology for technology's sake of the worst kind. (About 50% of the tickets below make nice tools; it's just that they don't fit together.)

@pudo reopened this Oct 26, 2012
@rufuspollock (Contributor)

I agree re the simple story, but we still have two duplicate tickets (made almost simultaneously). I don't mind which we close in favour of the other, but we should close one :-)

@teajaymars

I absolutely agree we need a more concrete definition of where this will fit into the Real World, but that conversation is complicated and involves our Partners. If it's going to fit into CKAN, that will become a user story in #24. If it's going to fit into a wider web service that will also become a user story. A lot of this depends on execution.

Right now it does feel excessively like a piece of isolated what-a-good-idea technology. Once we get a couple of iterations under our belt, we should talk to as many users as possible; most of them seem unable to reason about a service like this in abstract terms, and "it doesn't do what I would want" will be our trigger feedback. For now, please let's focus on a single user story issue to avoid clouding the Iteration 1 milestone.


pudo commented Oct 31, 2012

FWIW, here's my input to @nigelbabu:

Before discussing use cases, just a quick thing on the API: frankly, I don't think you should build one. At least not explicitly. If you end up solving a problem that would benefit from having this API, that's neat; but I'll guarantee you that whatever problem you choose, it will not be solved by a general-purpose web-service data conversion API.

Defining just the API is an overly abstract problem; "data conversion" is simply not a coherent enough problem to justify one. If you really must do it, I would at least fix one side of the equation: support a single input format or a single output format.

Ever since I've been in touch with RDF people, I have felt that trying to convert lots of data into a common format is a silly task: it tempts you to discard all the format semantics just to reach that unified format. Basically, you end up making your data less useful. So if you really must make this API, I would advocate generating lots of formats from a common one; that seems like a reasonably well-bounded task.
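The "fix one side" idea could be sketched as a registry of output writers that all consume one common row representation. This is a hypothetical Python sketch, not anything from this thread: the names (`WRITERS`, `convert`, the format keys) and the choice of a list of dicts as the common representation are all my own assumptions.

```python
import csv
import io
import json

# Hypothetical sketch: a converter with the input side fixed.
# The common representation is a list of dicts (rows); each
# registered writer serializes it into one output format.
WRITERS = {}

def writer(fmt):
    """Register a serializer for one output format."""
    def register(func):
        WRITERS[fmt] = func
        return func
    return register

@writer("json")
def to_json(rows):
    return json.dumps(rows)

@writer("csv")
def to_csv(rows):
    buf = io.StringIO()
    # Union of keys across rows, so ragged input still serializes.
    fields = sorted({key for row in rows for key in row})
    out = csv.DictWriter(buf, fieldnames=fields)
    out.writeheader()
    out.writerows(rows)
    return buf.getvalue()

def convert(rows, fmt):
    """Render the common representation into one target format."""
    if fmt not in WRITERS:
        raise ValueError(f"no writer for {fmt!r}")
    return WRITERS[fmt](rows)
```

Adding a format is then just one more `@writer(...)` function; the conversion problem stays fixed on the input side, which is the "reasonably well-bounded task" being argued for.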

Anyway, here are some problems that may be interesting to solve:

  • JSON Data Proxy 2.0 / Data Preview - It's nice, as long as you
    define it strictly in those terms. The output is a data preview; the
    input is various types of data. This has some fairly specific
    requirements: it needs to be really fast, work on large data (but
    only consider a snapshot of it), and it cannot require you to do any
    data modelling beforehand.
  • Data Sync Protocol - Rufus and Max Ogden talked about this one in
    the past, I think it'd be neat to have an implementation of
    http://www.dataprotocols.org/en/latest/sleep.html that works on
    PostgreSQL. Maybe this will then end up as a conversion thing as well,
    but the important thing is to really have a working implementation.
  • Node scraper - re-define "data converters" to include scraping and
    build a really nice tool to do server-side scraping in JavaScript.
    It's needed and would be easy enough to build.
  • Data modeller - build a tool for generating DSPL descriptions of
    data, i.e. something that allows you to make an abstract data model
    and then to generate either normalized versions from denormalized
    input files or vice versa. This would actually be useful, and it is
    something that Google hasn't so far gotten around to building
    themselves. Hard to sell as data converters, though - but a really
    great addition e.g. to CKAN.
