
[WIP] Making standard XNMT Graph interface #564

Merged
merged 26 commits into from
Jan 31, 2019

Conversation

philip30
Contributor

These are things I am working on now. I want to merge this to avoid further code conflict/redundancy in the future.

Basically, I am working on:

  • Making a shift-based RNNG parser. I already have the reader for the CoNLL format and working examples, though the parser itself is still a work in progress.
  • Making one package for simultaneous translation experiments. This package has its own translator and search strategies.
  • Some fixes to the previous merge.

@philip30 philip30 changed the title Merge some of my working code avoiding further merge conflict Merge some of my working codes to avoid future merge conflict Jan 16, 2019
@msperber
Contributor

Hey @philip30, I was wondering if it would be possible to separate this into the code that's WIP and the code that's ready to be used, and only send a PR for the latter? At this stage in the project I think it would be best to keep the master branch as free of unfinished code as possible, because unfinished code can be confusing and difficult for others to maintain. I understand that you want to avoid merge conflicts, of course. I have recently followed a strategy of keeping all my working code in separate files and only making PRs for my changes to files that are already in the master branch. That has worked pretty well for me and avoids most merge conflicts.

@philip30
Contributor Author

Hey @msperber, thank you for your suggestion. But this is actually working code, and I believe @armatthews is going to implement some RNNG as well? Since I have already done that, it would be best to establish common ground by merging the working code into master.

@msperber
Contributor

Ok, if this is all functional code we should work toward merging it. Based on some quick skimming, a couple of comments:

  • graph structure: I've already implemented a data structure in sent.Lattice which looks very similar. It would be better to merge the two, either by renaming lattice to graph in case they share the same functionality, or by making lattice a subclass of graph or something. In any case, graph should be a subclass of sent.Sentence and implement the interface accordingly.
  • simultaneous translation: I saw that you included both unit and integration tests which is great. But I couldn't find any description on what this feature is actually doing. Most of XNMT is documented quite well by now, could you add docstrings and type annotations similarly here? Also, it would be important to have some high-level explanation, e.g. by writing a description in the test config file, or in the module-level docstring, or some other place in the documentation.
  • RNNG: it would probably make sense for @armatthews to comment on this?
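To illustrate the first point, here is a minimal sketch of what unifying the two data structures could look like, with graph as a subclass of the sentence interface and lattice as a thin specialization. All class names and fields below are illustrative assumptions, not XNMT's actual API:

```python
# Hypothetical sketch: a GraphSentence fulfilling a minimal Sentence
# interface, with the lattice as a subclass that adds per-node scores.

class Sentence:
  """Minimal stand-in for the sent.Sentence interface."""
  def sent_len(self) -> int:
    raise NotImplementedError

class GraphSentence(Sentence):
  def __init__(self, nodes, edges):
    self.nodes = nodes    # list of node payloads
    self.edges = edges    # list of (from_idx, to_idx) pairs
  def sent_len(self) -> int:
    return len(self.nodes)

class LatticeSentence(GraphSentence):
  """A lattice is a graph whose nodes additionally carry scores."""
  def __init__(self, nodes, edges, scores):
    super().__init__(nodes, edges)
    self.scores = scores  # one score per node
```

Under this layout, any code written against `GraphSentence` (input readers, tree encoders) would accept lattices for free.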

@armatthews
Contributor

armatthews commented Jan 17, 2019

I have also already implemented RNNG. See my syntax branch. My implementation is mostly done, and it's trainable, though we can't decode with it yet.

I'm happy to work towards a merge, but I agree with keeping unfinished stuff in the branches.

@philip30
Contributor Author

@armatthews I can't see your RNNG in the syntax branch. Maybe you haven't pushed it yet? The latest commit covers the part where you read in the syntax tree. It is good that we have two different types of readers (yours reading the syntax tree while mine reads the CoNLL format).

@armatthews
Contributor

My RNNG code is only on my own branch. This link should take you directly to the RnngDecoder implementation.

@armatthews
Contributor

It looks like your implementation of RNNGs is for dependency trees whereas mine uses the phrase-structure tree formulation, so they may be complementary, though we might need to clean up the names a bit.

I like that you generalized things by allowing general (hyper-)graphs. That seems like it could unify several of these different syntax (or other!) structures together.

Aside from the RNNG stuff, I see that you added a last_model_scores member to the scorer, which concerns me a bit. In general, I think that we should try to keep objects stateless so that things don't get tricky when operating in parallel or on non-linear structures. Is it possible to move the last_model_scores into the SimultaneousTranslator class, rather than making it part of the scorer?
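The statelessness concern could be addressed along these lines: the scorer stays a pure function of its inputs, and any cached result lives with the caller. The class and method names below are hypothetical placeholders, not the actual XNMT signatures:

```python
# Illustrative sketch (hypothetical names): the scorer returns scores
# without mutating itself, and the translator keeps whatever state it needs.

class Scorer:
  def calc_scores(self, inputs):
    # purely a function of its inputs; no member state is written
    return [x * 2.0 for x in inputs]

class SimultaneousTranslator:
  def __init__(self, scorer):
    self.scorer = scorer
    self.last_model_scores = None  # state lives with the caller

  def step(self, inputs):
    scores = self.scorer.calc_scores(inputs)
    self.last_model_scores = scores  # cache here, not inside the scorer
    return scores
```

With this split, the same `Scorer` instance can be shared across parallel decoding paths without the paths clobbering each other's cached scores.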

Having the simultaneous translator model is really nice. It seems you also put a lot of work into improving the RL training, which I hope I can put to good use soon :)

@philip30
Contributor Author

@msperber I don't agree that the graph should implement the sentence interface. Here, the graph is just an intermediate representation of the CoNLL input (which is a list of lines), and the form I need is just a sequence of {shift, nt, reduce} actions, a much simpler structure than the graph. So I'd implement RNNGActionSentence instead of GraphSentence.

Also, my graph implementation is based on a list of edges, whereas your lattice implementation is a direct DAG of objects pointing to other objects? I think they are not very compatible, unless we want to turn the lattice implementation into a list-of-edges implementation.
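For readers following along, the two representations in question might look like this (all code hypothetical; conversion in one direction is mechanical, which is one way the designs could be reconciled):

```python
# Edge-list representation: the graph owns a flat list of (head, tail) pairs.
edge_list_graph = {
  "nodes": ["a", "b", "c"],
  "edges": [(0, 1), (1, 2)],
}

# Pointer-based DAG: each node object references its successors directly.
class Node:
  def __init__(self, label):
    self.label = label
    self.successors = []

a, b, c = Node("a"), Node("b"), Node("c")
a.successors.append(b)
b.successors.append(c)

def to_edge_list(nodes):
  """Flatten a pointer-based DAG into (from_idx, to_idx) pairs."""
  index = {n: i for i, n in enumerate(nodes)}
  return [(index[n], index[s]) for n in nodes for s in n.successors]
```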

@philip30
Contributor Author

@armatthews I took a look at your implementation and it is very good. Maybe I can merge the work by pulling your branch, working on it in my branch, and pushing it here? We can, of course, discuss how best to merge it.

@msperber
Contributor

I understand your point @philip30, and also think that it would be good to implement RNNGActionSentence.

The main reason why I think it would be great to unify the data structures is that there are many ways to use graphs in neural models by now, and it would be cool to be able to reuse data structures, input readers, etc. across different places. For example, the currently implemented LatticeLSTM is actually just a TreeLSTM that is bidirectional and happens to support lattices as inputs. By unifying our data structures, we would immediately have support for tree2seq models in XNMT. The TreeLSTM doesn't support hypergraphs, but that could also be added in the future if needed.

It would be perfectly fine to adjust the currently implemented lattice data structure though. The only requirements are:

I don't know much about the requirements on your part, but do you think it would be possible to unify things?

A positive side effect of merging our code would also be that you could plot trees using the code I already implemented: https://github.com/neulab/xnmt/blob/master/xnmt/sent.py#L545

@philip30 philip30 changed the title Merge some of my working codes to avoid future merge conflict [WIP] Making standard XNMT Graph interface Jan 23, 2019
@philip30
Contributor Author

Alright, I will be working on it :)

@philip30
Contributor Author

@msperber I have merged the Lattice with the graph interface. Can you take a look at the design?
Also, I still have an error in the topological-sort part, in transducers/lattice.py. I am not sure which part is wrong; I guess the error comes when it tries to topo-sort the reversed lattice?

But I have made sure that the graph operations are valid (as seen in the unit test). Your help is much appreciated! Thank you

@philip30
Contributor Author

I hacked around it and fixed it :) I think it's important to store the topo_sort results first before using them.
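A minimal sketch of the fix being described, under my own assumptions about the code: compute the topological order once over the edge list and reuse the stored result, rather than re-sorting (for the reversed lattice, the reversed order is already valid). Function and variable names here are mine, not XNMT's:

```python
from collections import deque

def topo_sort(num_nodes, edges):
  """Kahn's algorithm over an edge list of (from_idx, to_idx) pairs."""
  succ = [[] for _ in range(num_nodes)]
  in_deg = [0] * num_nodes
  for u, v in edges:
    succ[u].append(v)
    in_deg[v] += 1
  queue = deque(i for i in range(num_nodes) if in_deg[i] == 0)
  order = []
  while queue:
    u = queue.popleft()
    order.append(u)
    for v in succ[u]:
      in_deg[v] -= 1
      if in_deg[v] == 0:
        queue.append(v)
  assert len(order) == num_nodes, "graph has a cycle"
  return order

# Store the result once and reuse it for both directions:
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
forward = topo_sort(4, edges)
backward = list(reversed(forward))  # valid order for the reversed graph
```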

@msperber
Contributor

Cool, this looks very nice! So in the end we might have a CoNLLTreeReader that outputs either GraphSentence or RNNGActionSentence objects, and a LatticeReader that outputs LatticeSentence objects?

Actually, it might be possible to get rid of the special LatticeSentence class. I think the main reason for it is to be able to reverse the scores for each node whenever the graph is reversed, but we could simply implement HyperNode.reverse() that returns self by default, or returns a node with reversed probs in the case of LatticeNode. When reversing the graph, it would then simply add a call to reverse() for each node.
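The `reverse()` suggestion could be sketched roughly as follows (hypothetical names and fields): plain nodes return themselves unchanged, while lattice nodes return a copy with their direction-dependent scores swapped.

```python
class HyperNode:
  def __init__(self, value):
    self.value = value
  def reverse(self):
    return self  # nothing direction-dependent to flip

class LatticeNode(HyperNode):
  def __init__(self, value, fwd_prob, bwd_prob):
    super().__init__(value)
    self.fwd_prob, self.bwd_prob = fwd_prob, bwd_prob
  def reverse(self):
    # swap forward/backward scores when the graph is reversed
    return LatticeNode(self.value, self.bwd_prob, self.fwd_prob)

def reverse_graph(nodes, edges):
  """Reverse every edge and let each node decide how to reverse itself."""
  return [n.reverse() for n in nodes], [(v, u) for u, v in edges]
```

This way the generic graph code never needs to know whether it is holding plain nodes or score-carrying lattice nodes.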

BTW I also have some more graph-related code which I want to merge at some point (though probably not within the next couple of weeks), so I'll also take a look at whether things can be further simplified after this PR has been merged.

@philip30
Contributor Author

philip30 commented Jan 25, 2019

@msperber Thanks! I have a question.

Actually, it might be possible to get rid of the special LatticeSentence class. I think the main reason for it is to be able to reverse the scores for each node whenever the graph is reversed, but we could simply implement HyperNode.reverse() that returns self by default, or returns a node with reversed probs in the case of LatticeNode. When reversing the graph, it would then simply add a call to reverse() for each node.

Would this over-simplify things? So you don't care about the direction of the edges? I also think it is a bit strange to have a reverse() operation on a graph...

@msperber
Contributor

Hm, no, I didn't mean to get rid of the reverse operation. It's currently implemented in GraphSentence and then overridden in LatticeSentence, and I was wondering if we could get rid of the latter class altogether.

The above was just a suggestion, but I think if you'd prefer it'd also be fine to keep both GraphSentence and LatticeSentence classes for now, as this should already be enough to support encoding trees (in the form of GraphSentence) using the LatticeLSTM transducer.

@philip30
Contributor Author

@msperber I got rid of the classes and did a revision based on your suggestion. I hope it looks fine now? After this, I will pull the code from @armatthews' repo and try to merge the RNNG work.

@msperber
Contributor

Yes, this looks great, thanks!

@philip30
Contributor Author

I am merging right now. Currently there is a bunch of undocumented code in SimultaneousTranslator, but it is tested. I will follow up with documentation soon, when things are less busy.

@philip30 philip30 merged commit 3be7f10 into master Jan 31, 2019