filter endpoint #93

Closed

jnehring opened this issue Nov 20, 2015 · 14 comments

@jnehring
Member

We could implement an endpoint /toolbox/filter/{filter-id}. Each filter is a SPARQL query. When you submit RDF data to the filter, the SPARQL query is executed on that RDF and the result is returned to the user. So the general idea is to filter some triples out of the NIF.

This can have two applications:

a) Simplify the output of an e-Service. When someone says "I don't want to see NIF, I just want to get all entities in a text", we can create a filter that extracts the entities from the NIF. The user can then create a pipeline that first calls e-Entity to extract the entities and then calls the filter to drop everything in the NIF response except the entities.
b) Long pipelines might suffer from the problem that they produce more and more NIF and therefore process a lot of data; some unnecessary data processing might also occur. Using the filter you can remove some of this data inside the pipeline.

This might be a feature of FREME 0.5. It might make #89 obsolete.
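
For application a), such a filter could be a plain SPARQL SELECT query over the submitted NIF graph. A minimal sketch, assuming the annotations use nif:anchorOf for the surface form and the ITS 2.0 RDF property itsrdf:taIdentRef for the linked entity (these property names are assumptions about typical e-Entity output, not a fixed interface):

```sparql
# Hypothetical "entities only" filter submitted to /toolbox/filter/{filter-id}.
# It keeps the annotated text span and the linked entity URI and drops all
# other triples from the NIF document.
PREFIX nif:    <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
PREFIX itsrdf: <http://www.w3.org/2005/11/its/rdf#>

SELECT ?anchor ?entity
WHERE {
  ?annotation nif:anchorOf      ?anchor ;   # surface form in the text
              itsrdf:taIdentRef ?entity .   # URI of the linked entity
}
```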

@x-fran
Member

x-fran commented Nov 23, 2015

I like the first solution you've proposed, but we need to talk about this one too.
Personally, I'd like to see what others have to say before deciding anything.

@koidl

koidl commented Nov 25, 2015

@jnehring I am afraid I don't get it. Is your solution proposing that we first receive the NIF-based return and then send that return back to a filter, which then returns the delta that we need?

@fsasaki

fsasaki commented Nov 25, 2015

No. The proposed solution is that you send your content to FREME and then specify what filter you want to be applied before getting the result. No need to send things twice.


@fsasaki

fsasaki commented Nov 25, 2015

P.S.: you could send things twice, but that is not needed.


@jnehring
Member Author

I have two ideas for how the workflow of using this could look. Here is an example for the WRIPL use case of e-Terminology:

First idea: With pipelines

  1. You name the subset of information in NIF that you are interested in, e.g. "return only term URIs and remove all the rest".
  2. We create a filter that extracts your desired information.
  3. We or you create a pipeline that first calls e-Terminology and then the filter.
  4. You do not call e-Terminology directly; instead you call the pipeline. You submit your text and get only the term URIs back, without all the other information. I am not sure yet about the output format. I think it will still be JSON-LD, but with a very simple structure and without NIF.

2nd idea: With an extra parameter

Steps 1) and 2) are the same as above.
3) You attach the parameter "filter=terms-only" to the request you send to e-Terminology; "terms-only" is the id of the filter that we created in step 2) (sketched below). There is no pipeline involved. In this case you also don't use the /toolbox/filter endpoint.

Actually I think that the 2nd approach is easier for the users. There is a little more implementation work, but it saves us the work of creating the pipelines.
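
To make the example concrete, the "terms-only" filter from step 2) might be a query along these lines (a sketch; it assumes e-Terminology marks term candidates with itsrdf:termInfoRef, which is not confirmed here):

```sparql
# Hypothetical "terms-only" filter. With the 2nd idea it would be registered
# under the id "terms-only" and selected via the filter=terms-only parameter.
PREFIX nif:    <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
PREFIX itsrdf: <http://www.w3.org/2005/11/its/rdf#>

SELECT ?anchor ?termURI
WHERE {
  ?annotation nif:anchorOf       ?anchor ;    # the term as it appears in the text
              itsrdf:termInfoRef ?termURI .   # URI with further term information
}
```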

@fsasaki

fsasaki commented Nov 26, 2015

Sounds good.


@koidl

koidl commented Nov 26, 2015

@jnehring that sounds great! Some minor questions/comments:

  1. Can we combine separate services? For example, use a filter to get 'terms-only' (in e-Entity the entity labels and in e-Terminology the terms) in one return? Separate calls and filters for each service are fine; I just wonder if it's easier to have one filter option for both services. (Just an idea, not sure if it makes sense.)
  2. It's essential that the output format is JSON-LD or related. If a developer has to study documentation to understand the output, we have basically failed.

@fsasaki

fsasaki commented Nov 26, 2015

+1 to 2) from Kevin. Also, I see that people want to have simple output after a pipeline of services. That could be built in for the individual last service. E.g., looking at http://api.freme-project.eu/doc/0.4/tutorials/translate_EN-NL_including_terminology.html one could say, after e-Translation, via a dedicated query: give me the terms in the source and target language.
About the output: there is a simple standardized result format for SPARQL, http://www.w3.org/TR/sparql11-results-json/. It is not JSON-LD, but it is JSON and very regular. Kevin, could you provide a few example queries? We could then look into how the SPARQL queries and their results would look.

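For reference, a SELECT query with variables ?anchor and ?entity would come back in that standardized format roughly like this (the values are invented for illustration):

```json
{
  "head": { "vars": [ "anchor", "entity" ] },
  "results": {
    "bindings": [
      {
        "anchor": { "type": "literal", "value": "Berlin" },
        "entity": { "type": "uri", "value": "http://dbpedia.org/resource/Berlin" }
      }
    ]
  }
}
```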


@koidl

koidl commented Nov 26, 2015

@fsasaki

That result looks clean. @x-fran, what do you think?

In relation to the queries, we 'simply' need the entity label, the entity reference (URL) and the entity type to start with. The entity type is an interesting one for us because it allows us to filter out, for example, "Type:Thing". I am not sure, however, how to deal with the datasets in the future. The entity type relates to the DBpedia dataset. Things that might trigger questions (or which we should discuss now) are:

  1. Adding relevance
  2. Dealing with returns from datasets that have no entity type (such as the finance taxonomy). However, this might not be a problem because we won't need entity types if we have a domain-specific taxonomy.

Is this helpful? Or shall I try to construct a query example...?

Kevin
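
Not the promised example, but a first sketch of a query along the lines Kevin describes, assuming the usual NIF/ITS properties (nif:anchorOf for the label, itsrdf:taIdentRef for the entity URI, itsrdf:taClassRef for the type); the owl:Thing exclusion is only illustrative:

```sparql
# Hypothetical query returning entity label, entity reference and entity type,
# skipping annotations whose type is the generic owl:Thing.
# Relevance (point 1) could be added later if a confidence score is exposed.
PREFIX nif:    <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
PREFIX itsrdf: <http://www.w3.org/2005/11/its/rdf#>
PREFIX owl:    <http://www.w3.org/2002/07/owl#>

SELECT ?label ?entity ?type
WHERE {
  ?annotation nif:anchorOf      ?label ;
              itsrdf:taIdentRef ?entity .
  OPTIONAL { ?annotation itsrdf:taClassRef ?type . }   # point 2: type may be absent
  FILTER ( !bound(?type) || ?type != owl:Thing )
}
```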

@fsasaki

fsasaki commented Nov 26, 2015

That's helpful, thanks. I'll create a query example by next week or before.

@koidl

koidl commented Nov 26, 2015

Thanks @fsasaki, looking forward to it.

@jnehring
Member Author

Can we combine separate services? For example, use a filter to get 'terms-only' (in e-Entity the entity labels and in e-Terminology the terms) in one return? Separate calls and filters for each service are fine; I just wonder if it's easier to have one filter option for both services. (Just an idea, not sure if it makes sense.)

It should be possible to create a pipeline that 1) calls e-Entity, 2) calls e-Terminology, and 3) applies the filter. A dedicated filter needs to be created for this.
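
Such a combined filter could, for instance, union the two annotation types. A sketch, under the same assumed property names as above:

```sparql
# Hypothetical combined filter: returns entity links (from e-Entity) and
# term references (from e-Terminology) in a single result set.
PREFIX nif:    <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
PREFIX itsrdf: <http://www.w3.org/2005/11/its/rdf#>

SELECT ?anchor ?reference
WHERE {
  { ?a nif:anchorOf ?anchor ; itsrdf:taIdentRef  ?reference . }   # entities
  UNION
  { ?a nif:anchorOf ?anchor ; itsrdf:termInfoRef ?reference . }   # terms
}
```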

It's essential that the output format is JSON-LD or related. If a developer has to study documentation to understand the output, we have basically failed.

I agree. The output of the filter is RDF so it can be JSON-LD.

@fsasaki

fsasaki commented Nov 26, 2015

I agree. The output of the filter is RDF so it can be JSON-LD.

The output of a SPARQL query would be in the SPARQL result format, which can be stored as JSON or in a different syntax, but it won't be JSON-LD.

@jnehring
Member Author

jnehring commented Jan 8, 2016

Implemented. See the documentation at http://api-dev.freme-project.eu/doc/knowledge-base/filtering.html

jnehring closed this as completed Jan 8, 2016