Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tools refactoring #24

Merged
merged 34 commits into from
Feb 26, 2021
Merged

Tools refactoring #24

merged 34 commits into from
Feb 26, 2021

Conversation

simpletonDL
Copy link
Contributor

I think all tools should be part of the same python script. So in terms of argparse we can use subparsers. Something like this: python3 init tool_1 [tool_1 args], python3 init tool_2 [tool_2 args] and so on. This will cause CFPQ_Data to be formed as a single package.

@simpletonDL simpletonDL requested a review from gsvgit July 4, 2020 22:25
init.py Outdated
) as out_file:
for triple in graph:
out_file.write('%s %s %s \n' % (triple[0], triple[1], triple[2]))
#!/usr/bin/python3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems, that script is not only for initialization from now. Should it be renamed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I wanted to rename it, but I didn't come up with anything. Can you suggest a better name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I cannot. Maybe just main.

return extensions is None or any([file.endswith(ext) for ext in extensions])


def file_has_size(file, left_bound=None, right_bound=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why file size, not stored graph size?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be both options. There are problems with the size of graphs. Exactly to compute it for rdf files at first we need to parse this files. This takes a long time. So maybe the graph size should be approximated by the number of lines in the file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well. We plan to add metadata for graphs (number of vertices and edges). So, can we use this data here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, of course. I can add additional flags and corresponding filters when the metadata appears.

@gsvgit gsvgit mentioned this pull request Sep 16, 2020
8 tasks
@gsvgit gsvgit merged commit 67e420e into master Feb 26, 2021
@vadyushkins vadyushkins deleted the tools_refactoring branch April 15, 2021 12:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants