Ability to switch between presentation formats #40
Comments
This is closely related to issue #1. Please see the discussion starting here. |
Whenever the system handles a sentence. This isn't something the user should have to do; the system should do it automatically.
Yes, exactly. |
Ah, it was actually already converting input from CG to conllx before sending the data to the brat visualiser. That is what cgParse in CG2conllx.js does. |
Yes. Until recently it converted it to sdparse, which doesn't support POS tags. Conllx might work as a backend—the main thing is that it be a superset of other formats (like CG, sdparse, etc.). Otherwise we'll probably want to convert CG to conllu instead. |
conllx is not enough as it does not support spans, e.g.
Conllu
|
Ah, good point. There should be a comparison of the several formats somewhere. |
From #26 (comment) :
|
I was stuck on this because I couldn't see what architecture we need to store and handle formats. In this issue, we discussed that the system should automatically convert all data to CoNLL-U. But there is of course a problem with ambiguous CG, which can't be converted. I have now come up with the following:
I'm going to implement this behavior now. |
This is good. At some point it should also take the user to the first ambiguous analysis, but this will be tricky to implement. So we can file it as an enhancement and not worry about it for now.
So the corpus can be in either format natively, and you can convert it back and forth trivially. The first part of this is excellent—I just worry some about data loss when converting back and forth is so easy.
So this means you can only edit in the native format of the corpus (or whatever it happens to be after converting), and if you want to edit in the other format, you'd have to convert the whole corpus. While it would be harder to implement, it would be nice to have the |
I don't think there should be a button for Convert corpus to CG3. It isn't something we want to encourage. I agree with @jonorthwash that it would be nicer to have tabs. |
But editing in CG3 is easier than in CoNLL-U, at least if you're used to it. Anyway, this issue now seems to be done! |
Ok. Just a few notes:
I think #10 covers this. But yeah, it's going to be tricky...
There are 3 things that can be lost when converting from CoNLL-U to CG3:
I don't know how these data can be represented in CG3. This led me to the conclusion that when the corpus's native format is CoNLL-U, the interface should, when the sentence is viewed in CG3, store a copy of the CoNLL-U sentence with all these data. The same should work when the sentence is viewed as plain text.
So, I've already told you on VK, but just to conclude: you don't have to convert the whole corpus to be able to edit the tree with the GUI. You can just edit it in CG, as long as the current sentence is not ambiguous.
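To make the "store a copy" idea concrete, here is a minimal sketch. `SentenceRecord` and the `convert` argument are hypothetical names made up for illustration, not part of the project's code, and the sketch only covers viewing, not merging edits made in the CG3 view back into the CoNLL-U copy.

```javascript
// Hypothetical sketch: SentenceRecord and `convert` are illustrative names,
// not the project's API.
class SentenceRecord {
  constructor(conlluText) {
    this.native = conlluText; // untouched CoNLL-U copy, keeps whatever CG3 cannot hold
    this.views = {};          // derived views, keyed by format name
  }

  // Render the sentence in another format without discarding the native copy.
  // `convert` is any function from CoNLL-U text to the target format.
  view(format, convert) {
    if (format === "conllu") {
      return this.native;
    }
    if (!(format in this.views)) {
      this.views[format] = convert(this.native);
    }
    return this.views[format];
  }
}
```

With this shape, switching back to CoNLL-U after only looking at the CG3 or plain-text view returns the stored copy verbatim, so a lossy view conversion never touches the corpus data.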
Ok, then I'll remove this button. For editing text in CG3, you can use the
I'll create a new issue for this. |
Hmm, I didn't see the messages on VK... |
ah, @ftyers, it was in the personal chat with Jonathan. I'll resend them to you. |
When I input text in CG, it should be possible to switch to CoNLL-U format when a CG sentence is fully disambiguated. The backend format should probably be CoNLL-U, as it supports slightly more than CG. So you'd take the,
And this would become in the CoNLL-U something like:
You should be able to switch back and forth between the two formats without losing anything, e.g. something like model = backend CoNLL-U and view = CG / CoNLL-U / sdparse.
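A rough sketch of what a CoNLL-U backend model could look like, assuming the standard ten tab-separated CoNLL-U columns and `#` comment lines placed before the tokens. `parseConllu` and `toConllu` are hypothetical helper names, not functions from this repository. IDs are kept as strings so that range IDs such as `1-2` survive the round trip.

```javascript
// Hypothetical helpers; the column names follow the CoNLL-U format.
const FIELDS = ["id", "form", "lemma", "upos", "xpos",
                "feats", "head", "deprel", "deps", "misc"];

// Parse one CoNLL-U sentence into comment lines and token objects.
function parseConllu(text) {
  const comments = [];
  const tokens = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;      // skip blank lines
    if (line.startsWith("#")) {            // sentence-level comment
      comments.push(line);
      continue;
    }
    const cols = line.split("\t");
    const token = {};
    FIELDS.forEach((name, i) => { token[name] = cols[i]; });
    tokens.push(token);
  }
  return { comments, tokens };
}

// Serialize the model back to CoNLL-U text (comments first, then tokens).
function toConllu(sentence) {
  const rows = sentence.tokens.map(
    (t) => FIELDS.map((name) => t[name]).join("\t")
  );
  return [...sentence.comments, ...rows].join("\n");
}
```

Each view (CG, sdparse, the brat visualiser) would then be a function of this model rather than of another view, which is what keeps switching between views from accumulating losses.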
Note that an undisambiguated sentence could not be converted,
because you wouldn't know which token ID to give to "sent". If a sentence is not fully disambiguated in CG, the link/button/tab for CoNLL-U would be greyed out.
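The greying-out rule could be driven by a check like the following. This is only a sketch under assumptions: it expects CG3 stream format where a cohort starts with a line like `"<wordform>"`, each reading is a line indented by exactly one tab, and deeper indentation marks subreadings. `isAmbiguous` and `conlluTabEnabled` are hypothetical names.

```javascript
// Hypothetical sketch. A cohort is ambiguous if it has more than one
// reading at the top (one-tab) indentation level.
function isAmbiguous(cgText) {
  let readings = 0;
  for (const line of cgText.split("\n")) {
    if (line.startsWith('"<')) {
      readings = 0;                 // a new cohort begins
    } else if (/^\t"/.test(line)) {
      readings += 1;                // top-level reading; subreadings start with two tabs
      if (readings > 1) return true;
    }
  }
  return false;
}

// The CoNLL-U link/button/tab is usable only for fully disambiguated CG.
function conlluTabEnabled(cgText) {
  return !isAmbiguous(cgText);
}
```

Input indented with spaces instead of tabs would need a looser pattern, so the regular expression here is an assumption about the input rather than a general CG parser.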