Adding input and format arguments to cli tool #6
Please double-check the commented points.
Also, please look at the following complaints from cd ystafdb; pylama -i E501
(E501, the maximum line length check, is ignored because the repo does not have a pylama config for it):
ystafdb_metadata.py:1:1: W0611 '.data_dir' imported but unused [pyflakes]
ystafdb_metadata.py:23:1: C901 'generate_ystafdb_metadata_uris' is too complex (18) [mccabe]
ystafdb_metadata.py:38:5: E303 too many blank lines (2) [pycodestyle]
ystafdb_metadata.py:60:1: W0612 local variable 'author' is assigned to but never used [pyflakes]
__init__.py:10:1: E402 module level import not at top of file [pycodestyle]
__init__.py:11:1: E402 module level import not at top of file [pycodestyle]
__init__.py:16:41: W292 no newline at end of file [pycodestyle]
graph_common.py:6:1: E402 module level import not at top of file [pycodestyle]
graph_common.py:7:1: E402 module level import not at top of file [pycodestyle]
config_parser.py:5:1: E302 expected 2 blank lines, found 1 [pycodestyle]
config_parser.py:5:1: C901 'get_config_data' is too complex (14) [mccabe]
config_parser.py:60:5: E303 too many blank lines (2) [pycodestyle]
config_parser.py:119:30: W292 no newline at end of file [pycodestyle]
provenance_uris.py:1:1: W0611 '.filesystem.write_graph' imported but unused [pyflakes]
provenance_uris.py:6:1: W0611 'pathlib.Path' imported but unused [pyflakes]
provenance_uris.py:13:1: W0612 local variable 'purl' is assigned to but never used [pyflakes]
provenance_uris.py:41:1: W0612 local variable 'purl' is assigned to but never used [pyflakes]
provenance_uris.py:113:13: W292 no newline at end of file [pycodestyle]
foaf.py:17:1: W0612 local variable 'purl' is assigned to but never used [pyflakes]
bin/ystafdb.py:14:1: W0611 'docopt.docopt' imported but unused [pyflakes]
bin/ystafdb.py:35:38: E231 missing whitespace after ',' [pycodestyle]
bin/ystafdb.py:35:44: E231 missing whitespace after ',' [pycodestyle]
$ cd ystafdb
```
##### Download Base Data
Before progressing the installation, the base ystafdb data must be downloaded, and placed in a folder of your choosing, inside the repo.
"inside the repo": the folder can be anywhere, not necessarily inside the repo.
Options:
-h --help Show this screen.
--version Show version.

"""
import argparse
docopt was used at the beginning ... let's keep it
I think my comment was not clear. I meant to say that I would have liked to use docopt instead of argparse. But we should remove the import of docopt if we are not using it anymore.
@@ -1,2 +1,7 @@
appdirs
docopt
pandas
If you move the dependencies to the requirements.txt file, an install that uses python setup.py without pip install -r requirements.txt won't pull in the dependencies anymore.
I personally prefer to have dependencies managed in setup.py [or even better, setup.cfg]. Please take a look at https://packaging.python.org/discussions/install-requires-vs-requirements/#install-requires-vs-requirements-files and decide which one you prefer. I find it makes life easier to have them only in setup.[py,cfg].
Options:
-h --help Show this screen.
--version Show version.

"""
import argparse
from docopt import docopt
If you won't use docopt anymore for the CLI, remove this import.
docopt is fun: it lets you describe the CLI options in the docstring itself.
Do you have a reason why you prefer argparse?
Update to summarize comment:
Also, before merging: the following scenario does not work (I install the CLI and use it outside of the repo in the filesystem):
Here is the error:
And if I run it from the repo dir:
I think that if the config.json file is totally specific to the Yale DB we're parsing here, there is little need for the user to know about this file.
I suggest we keep the config.json as a "package" resource, and load it from there, so that the CLI can be used from anywhere in the filesystem.
From my understanding of the usage of this repo / package, it exists mostly for the CLI. The CLI is a first-class citizen, and keeping the config.json inside the package seems better suited because you wouldn't have to clone the repo to use it. Of course, if you think that the user should be able to provide his/her own config.json, then we should open another issue to add this feature as an option to the CLI.
Here is a little patch that implements what I suggest (to keep the config.json inside the package).
If you agree @IKnowLogic @kuzeko , I can commit it later.
edit: I removed the patch from the message contents because I pushed a commit (815e98c)
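Since the patch itself is no longer in the comment, here is a minimal sketch of the general idea, not the contents of commit 815e98c. It assumes Python 3.9 or later and that config.json is shipped as package data; the function name load_packaged_config and its defaults are hypothetical.

```python
import json
from importlib import resources


def load_packaged_config(package="ystafdb", resource="config.json"):
    """Read a JSON file shipped inside an installed package.

    The default package and file names are assumptions for
    illustration; the real layout of the package may differ.
    """
    # files() locates the package wherever it is installed, so the
    # lookup does not depend on the current working directory.
    text = (
        resources.files(package)
        .joinpath(resource)
        .read_text(encoding="utf-8")
    )
    return json.loads(text)
```

With something along these lines the CLI could be invoked from any directory, as long as the JSON file is actually included in the installed package.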
For the scenario I mention in my previous comment I think that we are trying to forcefully load the data as a "package resource", but this is not the case anymore if we allow the user to supply a directory with the data. Here are my working assumptions regarding
That is why I insist on the location of the
The following patch can be used to load files from the path, instead of trying to load them as resources.
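The patch is not reproduced in this conversation. As a rough sketch of what loading from a user-supplied path could look like (the helper name resolve_data_file is hypothetical and not taken from the repo), the data directory is treated as an ordinary filesystem path and checked before use:

```python
from pathlib import Path


def resolve_data_file(input_dir, filename):
    """Return the path of `filename` inside `input_dir`.

    Raises if the directory or the file does not exist, so the
    caller fails early with a readable message.
    """
    base = Path(input_dir).expanduser().resolve()
    if not base.is_dir():
        raise NotADirectoryError(f"Input directory not found: {base}")
    path = base / filename
    if not path.is_file():
        raise FileNotFoundError(f"Expected data file not found: {path}")
    return path
```

A caller could then pass the returned path to whatever reads the csv files, for example pandas, without going through the package resource machinery.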
I agree with request 1. Actually, since we have a repository for each dataset, the config file is not really needed at all, but let's keep it for now. For request 2 I think you are correct. The only reason we don't have the csv files in the data directory is that we don't want to redistribute the ystafdb dataset. Therefore we might as well not load them as a package resource. If @kuzeko also agrees, you are very welcome to push the changes.
Looks fine to me. Please proceed :)
Reason for pull request:
This pull request adds an <input/dir> argument to the CLI tool. This lets the user specify the location of the ystafdb data instead of relying on a static path. The pull request also adds checks to verify that the csv file paths exist before using them. This, along with better installation documentation, should resolve Issue 1 and Issue 4. To keep the CLI options consistent, the regenerate option is renamed to -o for output.
The pull request also adds documentation on how to use an existing virtual environment with this software, which resolves Issue 2.
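For orientation, an argparse interface of the kind described above might look roughly like the following. This is a sketch under assumptions, not the code in this pull request: the -i/--input spelling, the long --output form, the default output directory, and the prog name are all made up for illustration.

```python
import argparse


def build_parser():
    """Build a parser with an input directory and an output directory."""
    parser = argparse.ArgumentParser(
        prog="ystafdb",  # assumed program name
        description="Sketch of a CLI taking input and output directories.",
    )
    # Assumed flag names; the PR only states that an <input/dir>
    # argument is added and that the output option is -o.
    parser.add_argument(
        "-i", "--input", dest="input_dir", required=True,
        metavar="<input/dir>",
        help="directory containing the downloaded ystafdb csv files",
    )
    parser.add_argument(
        "-o", "--output", dest="output_dir", default="output",
        metavar="<output/dir>",
        help="directory to write generated files to (assumed default)",
    )
    return parser


def parse_args(argv=None):
    """Parse `argv`, or sys.argv[1:] when `argv` is None."""
    return build_parser().parse_args(argv)
```

Passing argv explicitly keeps the parser easy to exercise without a shell, for example parse_args(["-i", "data"]).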
File overview
README.md:
setup.py:
ystafdb/filesystem.py: