LB-503: Add command to connect to Timescale DB via develop.sh #767

Merged

Conversation

shivam-kapila
Collaborator

@shivam-kapila shivam-kapila commented Mar 27, 2020

Description

Just like ./develop.sh psql, we can have a command to connect to the Timescale DB directly.

Problem

There was no command to open a psql shell connected directly to the Timescale DB.

Solution

  • Add a timescale option to ./develop.sh that opens a psql shell connected to the Timescale DB automatically (see the sketch after this list).
  • Use connection URIs to connect to the psql DBs.
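
A minimal sketch of what the new subcommand might look like, assuming the docker-compose wrapper that develop.sh already uses for its psql command; the compose file path, service name, credentials, and database name are illustrative, not the exact merged code:

```sh
# Hypothetical sketch -- the compose file path, service name, and credentials
# are placeholders, not the exact values used in ListenBrainz's develop.sh.
if [ "$1" = "timescale" ]; then
    echo "Connecting to the Timescale DB..."
    exec docker-compose -f docker/docker-compose.yml -p listenbrainz \
        run --rm web psql "postgresql://listenbrainz:listenbrainz@lb_db_timescale:5432/listenbrainz_ts"
fi
```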

Action

  • Run the ./develop.sh timescale command in the terminal.
    Done. You can now query the Timescale DB directly.

@alastair
Collaborator

Because this is only for development, and the local user password is known and always the same, we can specify the password in the run command so that you don't have to enter it every time. There are a few different ways of doing this: https://stackoverflow.com/questions/6405127/how-do-i-specify-a-password-to-psql-non-interactively/6405162
Specifying a connection string like we already have in the config file (SQLALCHEMY_TIMESCALE_URI) is probably a good solution; it keeps us consistent across different parts of the project.
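
For reference, the two non-interactive approaches from that thread look roughly like this; the host, user, password, and database name below are placeholders standing in for the known local development credentials:

```sh
# Option 1: pass the password for a single invocation via the PGPASSWORD
# environment variable (placeholder credentials for local development).
PGPASSWORD=devpassword psql -h localhost -U listenbrainz listenbrainz_ts

# Option 2: embed everything in a libpq connection URI, matching the
# SQLALCHEMY_TIMESCALE_URI style already used in config.py.
psql "postgresql://listenbrainz:devpassword@localhost:5432/listenbrainz_ts"
```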

@shivam-kapila
Collaborator Author

@alastair As you suggested, I added a URL to connect to the psql DB, keeping it consistent with config.py. I tried to reuse the variables from config.py directly, but that isn't possible because of the import os statement and the spaces around = in the variable declarations.

With these URLs, the user no longer needs to enter any passwords. Please have a look.
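
To illustrate why config.py can't simply be sourced from bash: import os is not a shell command, and shell assignments must not have spaces around =. A hypothetical workaround, assuming config.py is importable from the working directory, would be to let Python print the value for the shell to capture:

```sh
# Hypothetical workaround, assuming config.py is importable from the
# current directory: have Python read the URI and hand it to psql.
TIMESCALE_URI=$(python3 -c "import config; print(config.SQLALCHEMY_TIMESCALE_URI)")
psql "$TIMESCALE_URI"
```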

Member

@mayhem mayhem left a comment

Thanks

@mayhem mayhem merged commit e7a06da into metabrainz:timescale Mar 30, 2020
mayhem added a commit that referenced this pull request Jul 22, 2020
* Interim check-in

* First cut of a very hacky version of the timescale writer, with lots of hardcoded shit in place

* Add some exception handling, but the writer is still not starting up

* Update the python script invocation with -u for unbuffered output

* Push timescale writer update

* more testing

* Start changing the config from influx to timescale

* Add tables for creating the timescale db

* Interim check-in trying to work out how to connect to the ts pg

* Interim check-in due to coronavirus

* Setup timescale init features

* Fixing config.py.sample

* Remove port spec

* More connect string fixes

* Fix -f in Timescale Setup in manage.py

* Fix -f in Timescale Setup in manage.py (#761)

* Interim check-in

* LB-503: Add command to connect to Timescale DB via develop.sh (#767)

* Fix -f in Timescale Setup in manage.py

* Add command to connect to Timescale via develop.sh

* Use URL to connect to PSQL

* Sanity check check-in

* Reverting to what I had before

* Timescale tests setup

* Removed merge conflicts

* Fix travis.yml

* Continue converting things to timescale and nuking influx code

* Adding missing init call

* Another interim check-in

* Still fixing tests setup

* Port to_influx to to_timescale and fix a few syntax errors

* Fixing listenstore, fixing tests, fixing setup. A test nearly passes!

* One basic test passes!

* Fix get recent listens function

* Timescale tests patch 1

* Another fix

* Add test for listen_count view

* Add wait for timescale

* Ported the remainder of the timescale listenstore, but fully untested

* Tests for dump listens

* Fix fetch recent listens

* Remove a print

* Some minor changes

* More tests

* Add the inserted_before check back

* fix travis problem where bash -c was removed and an extra \ added

* Use the correct type for the check

* Remove another backslash

* Convert the variable being passed to integer

* Add a sleep to make sure the listens got inserted before we create the dump

* Use correct image in integration tests

* Timescale Listenstore tests done

* Replace influx_connection with timescale_connection

* Fix lastfm user tests

* Continue removing influx and fixing tests

* Fixing User tests

* Remove unnecessary merge logs

* Fixed test_user tests. Finally

* Fix test_index tests

* Fixed up the general integration tests; though a lot of tests are still failing, all structural problems are solved.

* Fix test_from_timescale test

* All unit tests fixed

* Remove more influx references

* Removed a bunch of useless prints and then got stuck on the timestamp mess AGAIN!

* Fix insert json listens into timescale

* Fix integration tests (#809)

* Fix lastfm user tests

* Fixing User tests

* Remove unnecessary merge logs

* Fixed test_user tests. Finally

* Fix test_index tests

* Fix test_from_timescale test

* All unit tests fixed

* Fix insert json listens into timescale

* Cast to int at the right place

* ALL TESTS PASS!

* Remove prints

* A couple of minor fixes

* Fix timescale tests

* Do not make use of rmq connections until after the app is running.
If running in debug mode, close the connection, don't pool it.
This cuts down on error messages in the log.

* Modify tests to fetch exact listen_count in test environment

* Adding needed startup files

* Fix up DB initialization, add necessary files for a test server, and do some debugging of the pipeline.
Also prefix redis keys with env so that different servers don't conflict with one another.

* Change timescale_lb to listenbrainz_ts

* Add namespace keys to the remaining cache lookups

* Performance improvements for fetching min/max timestamps. Two more continuous aggregates are helping with this...

* If no timestamps are given, start off by 1 second.

* Try harder somewhat implemented. One unit test is failing and more tests need to be added for try_harder

* Fix redis tests

* Complete try harder logic

* Fix unit tests

* Fix integration tests

* Hide pager when try harder is not 0

* Update frontend tests

* Add test for try_harder

* pylint and pep8 fixes

* Hopefully fix export listens.

* Fix fetch all query

* Uncomment the PG section

* Upgrade to pg12, fix one test to have deterministic results

* Fix pep-8 and eslint issues. One eslint issue remains.

* Changed internal structure to use listened_at, track_name, user_name as the unique key.

* Use the last timestamp if we have one for the user, rather than defaulting to now()

* Change recording_msid to track_name for unique listens

* Fix the failing test

* Remove GRANT statements

* Add a docstring for fetch single timestamp

* Fix query indentation.

* Adapt integration tests

* Remove test specific behaviour

* Fix GRANT removal consequences

* Improve time_range support and add test

* Change try_harder to search_larger_time_range

* Remove log statement

* Make spark dumps the main dumps and stop dumping per user.

* Get rid of the old style dumps and make spark dumps the main dumping method. Rewrite full dump code to query by month. Tests have not been updated.

* Further progress towards simplified dumps. Some tests still need fixing.

* Update tests to reflect new dump system

* Fix more tests and bugs in dumping code

* Remove all sleep statements from timescale tests. They *shouldn't* be needed.

* Make the created field NOT NULL, since the import script will clean up timestamps properly. Remove some useless less and improve others to be simpler.

* Fix dump_manager tests. What a fucking ordeal, so much wasted time!

* Improve tests by mocking the min/max timestamp collection

* Adding rough draft of transmogrify

* Write to a per month dir struct

* Move the dump transmogrifyer to manage.py

* Remove the old transmogrify script

* Use the prod postgres instance

* Create a new endpoint for fetching listens

* Fix tests related to fetch listen count API

* Fix one last failing test

* Automatically generate spark dumps, as should be happening.

* Make the mogrifier work and act exactly like the current spark dumps

* If a list is already passed in to make list, just return it.

* Fix spark dump filename and make converting to spark rows more resilient

* Fix timestamp in spark dump and also create spark dumps for incremental

* Fix number of expected dumps in dump manager tests

* Adjust expected dump counts.

* Minor pep-8 bs

* Fix snapshot test

Co-authored-by: shivam-kapila <shivamkapila4@gmail.com>
Co-authored-by: Param Singh <iliekcomputers@gmail.com>
Co-authored-by: Ishaan Shah <ishaan.n.shah@gmail.com>