Provide an interactive mode #104

Open
philipmat opened this issue Aug 16, 2020 · 3 comments

@philipmat
Owner

Instead of making users read through the documentation and learn all the parameters to pass to the program, what if we provided an interactive run mode out of the box?

$ python discogsxml2db.py

Welcome to discogs-xml2db.
What would you like to do:
1. Download latest data dump files
2. Export data dump files to csv
3. Import csv files into database
4. All of the above
? _

Then each step could ask the questions it needs to. For example, step 2 could ask:

discogs-xml2db: Export Data Dump Files to CSV
Please enter the location where the Discogs data dumps have been downloaded: /tmp/discogs/dumps/2020-08/
Please enter the location where you want to place the CSV files: /tmp/discogs/csv/

etc

If "4" is chosen, then all the necessary questions could be asked up front and the steps would execute in sequence without further user interaction.
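
A minimal sketch of that "ask everything first, then run" flow. The step names, question keys, and the `run` helper are all hypothetical; nothing here exists in discogs-xml2db today, and the steps only record what they would do.

```python
from typing import Callable, Dict, List

# Hypothetical mapping of each step to the answers it needs.
QUESTIONS: Dict[str, List[str]] = {
    "download": ["dump_dir"],
    "export": ["dump_dir", "csv_dir"],
    "import": ["csv_dir"],
}

# Menu choices from the mock-up above; "4" runs every step in order.
MENU: Dict[str, List[str]] = {
    "1": ["download"],
    "2": ["export"],
    "3": ["import"],
    "4": ["download", "export", "import"],
}


def collect_answers(steps: List[str], ask: Callable[[str], str] = input) -> Dict[str, str]:
    """Ask each distinct question exactly once, before any step runs."""
    answers: Dict[str, str] = {}
    for step in steps:
        for key in QUESTIONS[step]:
            if key not in answers:
                answers[key] = ask(f"Please enter {key}: ")
    return answers


def run(choice: str, ask: Callable[[str], str] = input) -> List[str]:
    """Collect all answers, then execute the chosen steps without prompting again."""
    steps = MENU[choice]
    answers = collect_answers(steps, ask)
    log: List[str] = []
    for step in steps:
        # Placeholder: a real implementation would call the download,
        # export, or import code here instead of building a log line.
        used = ", ".join(f"{key}={answers[key]}" for key in QUESTIONS[step])
        log.append(f"{step}: {used}")
    return log
```

Calling `run("4")` would prompt for `dump_dir` and `csv_dir` once each and then go through all three steps; passing a different `ask` callable makes the flow easy to test without a terminal.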

The settings could be saved up locally so that subsequent executions would preserve the choices being made.

@ijabz
Collaborator

ijabz commented Aug 17, 2020

I don't know if it is necessary to provide an interactive mode; maybe just make it simpler.

This is the total of what I do:
cd $HOME/code/discogs-xml2db
rm nohup.out; nohup ./get_latest_dumps.sh &
tail -f nohup.out
cd speedup
rm nohup.out; nohup ./import.sh &
tail -f nohup.out

with import.sh containing the following:

rm -fr csv-dir
rm -fr dump-dir
mkdir -p csv-dir
mkdir -p dump-dir
mv ../*gz dump-dir
python3 exporter.py --bz2 --apicounts --export master --export label --export artist --export release dump-dir csv-dir
python3 postgresql/psql.py < postgresql/sql/DropTables.sql
python3 postgresql/psql.py < postgresql/sql/CreateTables.sql

@berz
Contributor

berz commented Aug 17, 2020

I don't think adding an interactive mode would make discogs-xml2db much easier to use as the people using it should already know how to look up documentation and use the command line. However I would like to make the parameters and documentation as clear as possible. Some possible improvements:

  • Promote speedup code to top level and remove the classic version. I see you are working on this.
  • get_latest_dumps: add parameter specifying a directory to download the files to. Create this directory if it doesn't exist. This prevents a mkdir and two cd commands for many users and allows us to simplify the documentation.
  • get_latest_dumps.sh: automatically download checksum file
  • exporter.py: add #! line so we can execute it directly instead of calling it using python3
  • exporter.py: take paths to dump files as arguments instead of specifying a directory and adding a --export parameter for every dump type (--export artist --export label ...)

These changes would make the commands pretty simple to understand and document. For example a mysql import would require:

# setup, can be automated when setting up the environment
$ git clone https://github.com/philipmat/discogs-xml2db.git
$ cd discogs-xml2db
$ sudo pip3 install -r requirements.txt
$ nano mysql/mysql.conf
$ mysql/exec_sql.sh < mysql/CreateTables.sql

# actual import
$ get_latest_dumps.sh xml-dumps
$ sha256sum -c xml-dumps/discogs_*_CHECKSUM.txt
$ exporter.py --bz2 --apicounts xml-dumps/*.xml.gz csvdir
$ mysql/importcsv.sh csvdir/*
$ mysql/exec_sql.sh < mysql/AssignPrimaryKeys.sql
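
To make the proposed exporter.py interface concrete (dump file paths as positional arguments instead of a directory plus repeated `--export` flags), here is a rough argparse sketch. It is not the current exporter.py; the filename pattern `discogs_<date>_<entity>s.xml.gz` and the helper names are assumptions, and the `--bz2` and `--apicounts` flags are only parsed, not implemented.

```python
import argparse
import re
from pathlib import Path
from typing import Dict, List

# Assumed dump naming, e.g. discogs_20200806_artists.xml.gz
ENTITY_RE = re.compile(r"_(artist|label|master|release)s\.xml(\.gz)?$")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="exporter.py")
    # Flags carried over from the commands in this thread.
    parser.add_argument("--bz2", action="store_true")
    parser.add_argument("--apicounts", action="store_true")
    # One or more dump files, followed by the CSV output directory.
    parser.add_argument("dump_files", nargs="+", type=Path)
    parser.add_argument("output_dir", type=Path)
    return parser


def entities_by_file(paths: List[Path]) -> Dict[str, Path]:
    """Infer the entity type from each file name, replacing --export."""
    found: Dict[str, Path] = {}
    for path in paths:
        match = ENTITY_RE.search(path.name)
        if match is None:
            raise ValueError(f"cannot tell which dump type this is: {path}")
        found[match.group(1)] = path
    return found
```

With this shape, a shell glob such as `xml-dumps/*.xml.gz` expands into the `dump_files` list and the last argument is taken as the output directory.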

I was planning on making pull requests for most of these changes but have been pretty busy the last couple of weeks. I should have some time for this come September.

@philipmat
Owner Author

@berz - thank you for your thoughts.

Would you mind opening up an issue for each of those items in your list?

It would make it easier to track them.

@philipmat philipmat removed the org label Aug 18, 2020