Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement functions for segmentation #36

Merged
merged 14 commits into from
Aug 12, 2022
Merged

Implement functions for segmentation #36

merged 14 commits into from
Aug 12, 2022

Conversation

Adda0
Copy link
Collaborator

@Adda0 Adda0 commented Aug 1, 2022

This PR implements functions used in segmentation algorithms. Furthermore, this PR contains implementation of helper functions and generally useful member functions on class (struct) Mata::Nfa::Nfa.

All implemented helper functions are required by segmentation algorithms, but are useful for operations on NFAs.

This PR:

  1. Fixes OrdVector implementation
    • Fixes a bug with remove method of OrdVector
    • Implements intersection method on OrdVectors
  2. Creates test file for OrdVector implementation
  3. Reflects changes to methods to make states initial and final
    • Accordingly modifies trans_size() method
  4. Reflects bug fix from Noodler implementation of method to get the shortest words in its Mata implementation
  5. Adds methods to
    • Trim redundant states (one of the requested functions in Missing automata operations #25)
    • Compute reachability of states (both forward and backward reachability)
    • Execute segmentation by a given symbol
    • Get directed graph from an NFA
    • Remove transitions from Nfa
    • Get sequence of transitions from Nfa

A new namespace inside Mata::Nfa named Mata::Nfa::SegNfa is introduced, which includes methods applicable only on segment automata, NFAs whose state space can be split into several segments connected by ε-transitions in a chain. No other ε-transitions are allowed. As a consequence, no ε-transitions can appear in a cycle.

The original method Mata::Nfa::Nfa::trans_size() returned a number of states instead of the number of transitions. The method was fixed to truly return the number of transitions in an automaton. Furthermore, the method was renamed to get_num_of_trans() to follow the naming structure of an already existing method get_num_of_states() and not introduce a new structure for similar methods. For that reason, Python binding for the original trans_size() was updated as well to call the new get_num_of_trans() method instead.

All introduced methods are documented and there are basic tests written for all of them.

Copy link
Member

@martinhruska martinhruska left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no problems with pull request, it is suitable for merge. Please check it too @tfiedor .

@Adda0 Adda0 mentioned this pull request Aug 11, 2022
Copy link
Member

@tfiedor tfiedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with the PR. It's well done, though I have some minor issues, that I would suggest to consider:

  1. Some of the names could be enhanced (also I would suggest to rename set_initial and set_final to reset_initial, since you are clearing the states, which is captured in the action reset, but not in set)
  2. On several places you use *this = ..., which seems to be dangerous. Maybe it is OK, but I would suggest to at least check, that this is OK to be done in C++. Since you are calling this from the method, I feel it could lead to some dangling pointers, memory corruption or some other unexpected behaviour.

include/mata/nfa.hh Show resolved Hide resolved
include/mata/nfa.hh Outdated Show resolved Hide resolved
include/mata/nfa.hh Show resolved Hide resolved
src/nfa/nfa.cc Show resolved Hide resolved
src/nfa/nfa.cc Show resolved Hide resolved
src/nfa/nfa.cc Show resolved Hide resolved
@Adda0
Copy link
Collaborator Author

Adda0 commented Aug 11, 2022

Thank you for the reviews.

Some of the names could be enhanced (also I would suggest to rename set_initial and set_final to reset_initial, since you are clearing the states, which is captured in the action reset, but not in set)

This is a great idea. I will apply the suggested changes.

On several places you use *this = ..., which seems to be dangerous. Maybe it is OK, but I would suggest to at least check, that this is OK to be done in C++. Since you are calling this from the method, I feel it could lead to some dangling pointers, memory corruption or some other unexpected behaviour.

I have seen this structure used somewhere. Nevertheless, I will do a proper research about this issue. If it reveals to be a problem (with the suggested implications), I will update the underlaying member variables instead (set of initial or final states and transition relation).

@tfiedor
Copy link
Member

tfiedor commented Aug 11, 2022

On several places you use *this = ..., which seems to be dangerous. Maybe it is OK, but I would suggest to at least check, that this is OK to be done in C++. Since you are calling this from the method, I feel it could lead to some dangling pointers, memory corruption or some other unexpected behaviour.

I have seen this structure used somewhere. Nevertheless, I will do a proper research about this issue. If it reveals to be a problem (with the suggested implications), I will update the underlaying member variables instead (set of initial or final states and transition relation).

As I said, it is just a hunch, that it could be dangerous, maybe Ondra knows more about this. So, I would maybe investigate a little about this, to avoid potential issues in future. I used to program in C++ a lot and didn't encountered this gem. I kind of like it, since it's basically a defilement of the language :-D.

@Adda0
Copy link
Collaborator Author

Adda0 commented Aug 11, 2022

On several places you use *this = ..., which seems to be dangerous. Maybe it is OK, but I would suggest to at least check, that this is OK to be done in C++. Since you are calling this from the method, I feel it could lead to some dangling pointers, memory corruption or some other unexpected behaviour.

I have seen this structure used somewhere. Nevertheless, I will do a proper research about this issue. If it reveals to be a problem (with the suggested implications), I will update the underlaying member variables instead (set of initial or final states and transition relation).

As I said, it is just a hunch, that it could be dangerous, maybe Ondra knows more about this. So, I would maybe investigate a little about this, to avoid potential issues in future. I used to program in C++ a lot and didn't encountered this gem. I kind of like it, since it's basically a defilement of the language :-D.

You are not wrong. I can see how this could be a bad practise at minimal, maybe even dangerous in and of itself. You have certainly piqued my interest.

@martinhruska
Copy link
Member

I don't have strong opinion on *this= stuff. It is definitely weird since you need to think about what happens after the assignment. However, It's just a layman opinion without a strong argument for or against.

@Adda0
Copy link
Collaborator Author

Adda0 commented Aug 12, 2022

All issues with this PR are resolved and the changes we agreed upon are applied. Feel free to review the new changes (just some renamed functions, as suggested). Otherwise, from where I stand, the PR is finished. After these potential reviews, the PR is ready to be merged.

@martinhruska
Copy link
Member

Ok, I'm gonna merge it.

@martinhruska martinhruska merged commit a399370 into VeriFIT:devel Aug 12, 2022
@Adda0 Adda0 deleted the segmentation branch August 12, 2022 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants