Introduce views? #3
Comments
Views are nice, but using them inside of TTK is probably a case of over-engineering. Tarsqi creates documents according to a certain pipeline and that's it; no views are needed for that. We may want to add several components that add EVENTS, just as there are several components that add TLINKS, but using a source attribute to keep track of which component added a tag would be enough. Let's focus on flexibility in taking several kinds of input and adjusting the pipeline accordingly. For example, taking YTEX output (in TTK format) could be useful even if we just use the tokenization, tagging and lemmatization of YTEX. We may want to spend some time on creating a ytex --source option which loads some tags into the tarsqi_tags.
Here is a potential advantage of having views. Currently, you can run a pipeline with a preprocessor and save the results as a ttk file. You can then run a pipeline with Evita. But say you ran the second pipeline with the preprocessor as well. With views, Evita would select one of the views and nothing bad would happen, except that in the end you would have two views with preprocessor data. Currently, however, you get a document with duplicate sentences and chunks (somehow tokens do not get duplicated), and this results in weird TarsqiTree instances that break Evita.
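To make the duplication scenario concrete, here is a minimal sketch of how views could keep a repeated preprocessor run from polluting the tags that a later component reads. None of this is existing TTK code: `Tag`, `View`, `Document`, `new_view`, `latest_view` and `run_preprocessor` are hypothetical names, and the component and tag names are only illustrative.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tag:
    """Hypothetical tag: a name plus character offsets."""
    name: str
    begin: int
    end: int


@dataclass
class View:
    """Hypothetical view: the tags produced by one component run."""
    view_id: str
    producer: str
    tags: list = field(default_factory=list)


@dataclass
class Document:
    """Hypothetical document that holds views instead of one flat tag list."""
    views: list = field(default_factory=list)

    def new_view(self, producer):
        view = View(view_id="v%d" % (len(self.views) + 1), producer=producer)
        self.views.append(view)
        return view

    def latest_view(self, producer):
        """Return the most recent view created by producer, or None."""
        for view in reversed(self.views):
            if view.producer == producer:
                return view
        return None


def run_preprocessor(doc):
    # Stand-in for the real preprocessor: adds one sentence and one chunk.
    view = doc.new_view("PREPROCESSOR")
    view.tags.append(Tag("s", 0, 20))
    view.tags.append(Tag("ng", 0, 8))
    return view


doc = Document()
run_preprocessor(doc)  # first pipeline
run_preprocessor(doc)  # second pipeline repeats the preprocessor

# A downstream component reads from a single view, so it sees each
# sentence once even though the preprocessor ran twice.
selected = doc.latest_view("PREPROCESSOR")
sentences = [t for t in selected.tags if t.name == "s"]
```

In this sketch the document ends up with two preprocessor views, but a component that selects one of them only ever sees one sentence and one chunk, which is the behavior described above.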
I can see the sense of having views, given this duplication problem. I guess the question is whether having views is the easiest way to solve that.
Using views is definitely a more scalable solution. I will look a bit more into how much coding and added complexity it would actually take.
Is there a case for adding views to Tarsqi? A view would contain some set of tags and would be totally separate from other views. We could have a view for Evita events and one for events taken from another component.
Not sure if this is worth the trouble. An alternative is to enforce that each tag has a source attribute that stores what component created the tag.
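For comparison, the source-attribute alternative could look roughly like the sketch below. This is an assumption-laden illustration, not the actual TTK tag repository: `TagRepository`, `add_tag` and `find` are hypothetical names, and `OTHER_EVENT_COMPONENT` is a made-up component name.

```python
class TagRepository:
    """Hypothetical flat tag store that requires a source on every tag."""

    def __init__(self):
        self.tags = []

    def add_tag(self, name, begin, end, attrs):
        # Enforce the rule: every tag must record which component created it.
        if not attrs.get("source"):
            raise ValueError("tag <%s> is missing a 'source' attribute" % name)
        self.tags.append(
            {"name": name, "begin": begin, "end": end, "attrs": dict(attrs)})

    def find(self, name, source=None):
        """Return tags with the given name, optionally limited to one source."""
        return [t for t in self.tags
                if t["name"] == name
                and (source is None or t["attrs"]["source"] == source)]


repo = TagRepository()
repo.add_tag("EVENT", 10, 16, {"source": "EVITA"})
repo.add_tag("EVENT", 10, 16, {"source": "OTHER_EVENT_COMPONENT"})

# Events from different components share one flat list but stay separable.
evita_events = repo.find("EVENT", source="EVITA")
```

This keeps a single flat tag list and needs no new container type; the cost is that every consumer has to filter on source itself.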