interface for adjusting tokenisation in graph #2

Closed
jonorthwash opened this issue Nov 14, 2016 · 9 comments

@jonorthwash
Owner

jonorthwash commented Nov 14, 2016

Add a way to adjust the tokenisation of elements of a UD graph. It should let you divide existing tokens, combine existing tokens, reorder existing tokens, split a token into subtokens, insert tokens, and delete tokens, with renumbering handled automatically.

Keyboard shortcuts should be involved, but mouse actions too, such as clicking and dragging across an element that should be split from another element before hitting the "split" key or something.

@ftyers
Collaborator

ftyers commented Nov 14, 2016

So imagine you have:

thisisatest.

You click on (t, 0), hold the mouse button down, drag to (s, 3), and release.

this makes:

this isatest.

Now you click on (i, 5), hold the button down, and drag to (s, 6), which makes:

this is atest.

And so on, until the whole string is properly surface-tokenised.
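The click-and-drag steps above can be sketched as a small function. This is only an illustration, not the project's code: `split_at_selection` is a hypothetical name, and it assumes the selection is given as inclusive character offsets into the current string (spaces included), as in the (t, 0) to (s, 3) example.

```python
def split_at_selection(text: str, start: int, end: int) -> str:
    """Make text[start..end] (inclusive) its own token by adding spaces around it.

    Hypothetical helper: offsets count every character of the current
    string, including spaces inserted by earlier splits.
    """
    if not 0 <= start <= end < len(text):
        raise ValueError("selection is out of range")
    selected = text[start:end + 1]
    if " " in selected:
        raise ValueError("selection must lie within a single token")
    # Drop spaces already touching the selection so none are doubled.
    before = text[:start].rstrip(" ")
    after = text[end + 1:].lstrip(" ")
    return " ".join(piece for piece in (before, selected, after) if piece)


step1 = split_at_selection("thisisatest.", 0, 3)   # "this isatest."
step2 = split_at_selection(step1, 5, 6)            # "this is atest."
```

Repeating the call for each remaining selection continues the tokenisation in the same way.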

@jonorthwash
Owner Author

But you'd have to enter "tokenise" mode, and then leave it when done. And you'd need a similar way to, e.g., combine existing tokens, which has two subtypes: merging tokens, and merging tokens while creating subtokens.

@jonorthwash
Owner Author

Basic interface:

  • select content (e.g., across tokens), press key combo A to merge/split tokens
  • select content (e.g., within a token), press key combo B to merge/split subtokens/"words"

Will need to include:

  • logic for renumbering; one option:
    • Say nodes 8 and 9 are merged. Renumber all nodes above 9 by subtracting one, and keep a hash that maps old numbers to new ones. Then go through all targets and renumber them according to the hash.
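
A minimal sketch of that renumbering option, under stated assumptions: tokens are plain dicts with 1-based `id`, a `form`, and a `head` (0 for root), the merged node keeps the lower number and the first token's attributes, and `merge_and_renumber` is a hypothetical name rather than anything in the codebase.

```python
def merge_and_renumber(tokens, first_id):
    """Merge token `first_id` with the token after it, then renumber.

    Returns (new_tokens, mapping), where mapping goes from old id to new id.
    The input list is not modified.
    """
    second_id = first_id + 1
    by_id = {t["id"]: t for t in tokens}
    if first_id not in by_id or second_id not in by_id:
        raise ValueError("both tokens must exist")

    # The hash of old -> new ids: ids up to the merge point are unchanged,
    # the absorbed token maps onto the survivor, the rest shift down by one.
    mapping = {0: 0}  # 0 stands for the root and never changes
    for t in tokens:
        old = t["id"]
        mapping[old] = old if old <= first_id else old - 1

    first, second = by_id[first_id], by_id[second_id]
    merged = dict(first, form=first["form"] + second["form"])
    # Assumption: if the survivor was attached to the absorbed token,
    # inherit that token's head so the merged node does not point at itself.
    if merged["head"] == second_id:
        merged["head"] = second["head"]

    new_tokens = []
    for t in tokens:
        if t["id"] == second_id:
            continue  # absorbed into the merged token
        new_t = dict(merged if t["id"] == first_id else t)
        new_t["id"] = mapping[new_t["id"]]
        new_t["head"] = mapping[new_t["head"]]  # renumber the target
        new_tokens.append(new_t)
    return new_tokens, mapping


tokens = [
    {"id": 1, "form": "a", "head": 2},
    {"id": 2, "form": "b", "head": 0},
    {"id": 3, "form": "c", "head": 2},
    {"id": 4, "form": "d", "head": 3},
]
new_tokens, mapping = merge_and_renumber(tokens, 2)
```

Building the whole mapping first and applying it in a second pass means every target is translated from its original number exactly once.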

@maryszmary
Collaborator

At the moment the interface supports splitting a token into new ones. To split a token, right-click the token node, use the arrow keys to move to the place where you want to split, insert a space, and press Enter. By default, all attributes stay with the first part.

An illustration. Before:
image
After:
image

So far the positioning of the input element is not ideal; I'm going to improve it.
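
As a rough model of the split behaviour described in this comment (not the actual implementation): the edited text is cut at the inserted spaces, the first part keeps the original attributes, and the later parts start empty. The dict layout, the `upos` key, and the `split_token` name are all assumptions made for the example.

```python
def split_token(token, edited_form):
    """Split `token` where spaces were inserted into its form.

    `token` is a dict with a "form" key plus any other attributes;
    `edited_form` is the form after the user has typed the spaces.
    """
    parts = edited_form.split()
    if "".join(parts) != token["form"]:
        raise ValueError("only spaces may be inserted when splitting")
    # All attributes default to the first part.
    first = dict(token, form=parts[0])
    # Assumption: later parts carry only their form until annotated.
    rest = [{"form": part} for part in parts[1:]]
    return [first] + rest


pieces = split_token({"form": "isatest", "upos": "VERB"}, "is atest")
```

Renumbering of the following nodes is left out here to keep the sketch focused on where the attributes go.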

@jonorthwash
Owner Author

Nice! Can tokens be merged too, or not yet?

@maryszmary
Collaborator

Not yet.

@maryszmary
Collaborator

Now they can. Here is how to do it:
Left-click the main token (the token whose markup you want to keep), press m, then use the right or left arrow key to choose which neighbour to append to the main token.

An example: left-click the token, then press m
image

I press the right arrow to choose the token I want to append
image
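
The merge interaction in this comment could be modelled roughly as follows. This is a sketch under assumptions: tokens are dicts in a list, `merge_with_neighbour` is a hypothetical name, and the forms are joined in surface order even when the left neighbour is chosen (the comment does not say how that case is handled).

```python
def merge_with_neighbour(tokens, index, direction):
    """Merge tokens[index] (the main token) with its left or right neighbour.

    The merged token keeps the main token's attributes; only the form
    is combined. Returns a new list.
    """
    offset = {"left": -1, "right": 1}[direction]
    other = index + offset
    if not 0 <= index < len(tokens) or not 0 <= other < len(tokens):
        raise IndexError("no neighbour in that direction")
    lo, hi = sorted((index, other))
    # Assumption: forms are concatenated in surface (left-to-right) order.
    merged = dict(tokens[index], form=tokens[lo]["form"] + tokens[hi]["form"])
    return tokens[:lo] + [merged] + tokens[hi + 1:]


sentence = [
    {"form": "is", "upos": "AUX"},
    {"form": "a", "upos": "DET"},
    {"form": "test", "upos": "NOUN"},
]
merged_right = merge_with_neighbour(sentence, 1, "right")
```

Here the main token is "a", so the merged token "atest" keeps the DET markup.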

@jonorthwash
Owner Author

Cool!

@jonorthwash
Owner Author

This is the basic functionality for this issue. We'll open a new issue for subtoken-related functionality.
