Database layer for scenario and execute list #266

Merged
jenhagg merged 9 commits into develop from jon/postgres on Aug 27, 2020

@jenhagg
Collaborator

@jenhagg jenhagg commented Aug 24, 2020

Purpose

Implement a storage API for the scenario and execute lists using a Postgres database, aiming to be consistent with the existing CSV implementation. Provide a base class, SqlStore, that simplifies usage via a context manager and reduces boilerplate query definitions in the ScenarioTable and ExecuteTable classes that inherit from it.
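
For readers who have not opened the diff, the context-manager pattern described above can be sketched roughly as follows. This is not the code in sql_store.py: the class names SqlStoreSketch and ExecuteTableSketch, their methods, and the use of the standard-library sqlite3 module as a stand-in for psycopg2 (so the sketch runs without a database server) are all assumptions made for illustration. Note that sqlite3 uses ? placeholders where psycopg2 uses %s.

```python
import sqlite3


class SqlStoreSketch:
    """Hypothetical base class that owns the connection and cursor lifecycle."""

    table = None   # set by subclasses
    columns = ()   # set by subclasses

    def __init__(self, dsn=":memory:"):
        self.dsn = dsn
        self.conn = None
        self.cur = None

    def __enter__(self):
        self.conn = sqlite3.connect(self.dsn)
        self.cur = self.conn.cursor()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Commit on success, roll back on error, and always close.
        if exc_type is None:
            self.conn.commit()
        else:
            self.conn.rollback()
        self.cur.close()
        self.conn.close()
        return False  # never swallow exceptions

    def insert(self, values):
        # Table and column names come from class attributes, not user input;
        # the values themselves are always passed as bound parameters.
        cols = ",".join(self.columns)
        placeholders = ",".join("?" for _ in self.columns)
        self.cur.execute(
            f"INSERT INTO {self.table} ({cols}) VALUES ({placeholders})", values
        )

    def select_all(self):
        self.cur.execute(f"SELECT {','.join(self.columns)} FROM {self.table}")
        return self.cur.fetchall()


class ExecuteTableSketch(SqlStoreSketch):
    """Subclasses only declare what differs: the table and its columns."""

    table = "execute_list"
    columns = ("id", "status")


def demo():
    with ExecuteTableSketch() as store:
        store.cur.execute(
            "CREATE TABLE execute_list (id INTEGER PRIMARY KEY, status TEXT)"
        )
        store.insert((9001, "created"))
        return store.select_all()
```

The point of the design is that each subclass carries only its table-specific details, while connection handling and the generic insert/select statements live in one place.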

What the code does

The shared logic is in sql_store.py, and SQL implementations have been added to scenario_list.py and execute_list.py alongside the CSV versions. The tests are probably the best way to see how these are used. The readme in powersimdata/data_access describes how to run a local database container, which is used for the tests and can also be connected to manually. (Also see the note there: I haven't checked in the SQL schema yet, so if you want to try it out, check this gist or message me and I will help.)

What the code doesn't do is change how we currently store data. Nothing here is actually used yet, but the PR is getting on the larger side, so it's probably a good point to check in. Next steps include having the schema created automatically and setting up a container on the server to migrate existing data into, while having the code write to both CSV and SQL.

Time to review

~ 40 mins

@jenhagg jenhagg self-assigned this Aug 24, 2020
@jenhagg jenhagg added this to the WTT90s milestone Aug 24, 2020
@jenhagg jenhagg added the new feature Feature that is currently in progress. label Aug 24, 2020
@jenhagg jenhagg linked an issue Aug 25, 2020 that may be closed by this pull request
@rouille
Collaborator

rouille commented Aug 25, 2020

I tried to run pipenv sync and it failed:

[~/CEM/PowerSimData] (jon/postgres) brdo$ pipenv sync
Installing dependencies from Pipfile.lock (3d7fb2)…
  🐍   ▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉ 3/3 — 00:00:00
An error occurred while installing psycopg2==2.8.5 --hash=sha256:ac5b23d0199c012ad91ed1bbb971b7666da651c6371529b1be8cbe2a7bf3c3a9 --hash=sha256:440a3ea2c955e89321a138eb7582aa1d22fe286c7d65e26a2c5411af0a88ae72 --hash=sha256:d3b29d717d39d3580efd760a9a46a7418408acebbb784717c90d708c9ed5f055 --hash=sha256:f7d46240f7a1ae1dd95aab38bd74f7428d46531f69219954266d669da60c0818 --hash=sha256:6b306dae53ec7f4f67a10942cf8ac85de930ea90e9903e2df4001f69b7833f7e --hash=sha256:2327bf42c1744a434ed8ed0bbaa9168cac7ee5a22a9001f6fc85c33b8a4a14b7 --hash=sha256:2c0afb40cfb4d53487ee2ebe128649028c9a78d2476d14a67781e45dc287f080 --hash=sha256:2df2bf1b87305bd95eb3ac666ee1f00a9c83d10927b8144e8e39644218f4cf81 --hash=sha256:a0984ff49e176062fcdc8a5a2a670c9bb1704a2f69548bce8f8a7bad41c661bf --hash=sha256:6a471d4d2a6f14c97a882e8d3124869bc623f3df6177eefe02994ea41fd45b52 --hash=sha256:27c633f2d5db0fc27b51f1b08f410715b59fa3802987aec91aeb8f562724e95c --hash=sha256:acf56d564e443e3dea152efe972b1434058244298a94348fc518d6dd6a9fb0bb --hash=sha256:132efc7ee46a763e68a815f4d26223d9c679953cd190f1f218187cb60decf535! Will try again.
Installing initially failed dependencies…
[InstallError]:   File "/usr/local/Cellar/pipenv/2020.6.2/libexec/lib/python3.8/site-packages/pipenv/cli/command.py", line 682, in sync
[InstallError]:       retcode = do_sync(
[InstallError]:   File "/usr/local/Cellar/pipenv/2020.6.2/libexec/lib/python3.8/site-packages/pipenv/core.py", line 2890, in do_sync
[InstallError]:       do_init(
[InstallError]:   File "/usr/local/Cellar/pipenv/2020.6.2/libexec/lib/python3.8/site-packages/pipenv/core.py", line 1306, in do_init
[InstallError]:       do_install_dependencies(
[InstallError]:   File "/usr/local/Cellar/pipenv/2020.6.2/libexec/lib/python3.8/site-packages/pipenv/core.py", line 900, in do_install_dependencies
[InstallError]:       batch_install(
[InstallError]:   File "/usr/local/Cellar/pipenv/2020.6.2/libexec/lib/python3.8/site-packages/pipenv/core.py", line 796, in batch_install
[InstallError]:       _cleanup_procs(procs, failed_deps_queue, retry=retry)
[InstallError]:   File "/usr/local/Cellar/pipenv/2020.6.2/libexec/lib/python3.8/site-packages/pipenv/core.py", line 703, in _cleanup_procs
[InstallError]:       raise exceptions.InstallError(c.dep.name, extra=err_lines)
[pipenv.exceptions.InstallError]: Collecting psycopg2==2.8.5
[pipenv.exceptions.InstallError]:   Using cached psycopg2-2.8.5.tar.gz (380 kB)
[pipenv.exceptions.InstallError]:     ERROR: Command errored out with exit status 1:
[pipenv.exceptions.InstallError]:      command: /Users/brdo/.local/share/virtualenvs/PowerSimData-MpUK62nT/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/bk/t227kx5j60gbnh7gjzf3g_c80000gn/T/pip-install-zhegjno8/psycopg2/setup.py'"'"'; __file__='"'"'/private/var/folders/bk/t227kx5j60gbnh7gjzf3g_c80000gn/T/pip-install-zhegjno8/psycopg2/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /private/var/folders/bk/t227kx5j60gbnh7gjzf3g_c80000gn/T/pip-pip-egg-info-18bcqgjd
[pipenv.exceptions.InstallError]:          cwd: /private/var/folders/bk/t227kx5j60gbnh7gjzf3g_c80000gn/T/pip-install-zhegjno8/psycopg2/
[pipenv.exceptions.InstallError]:     Complete output (23 lines):
[pipenv.exceptions.InstallError]:     running egg_info
[pipenv.exceptions.InstallError]:     creating /private/var/folders/bk/t227kx5j60gbnh7gjzf3g_c80000gn/T/pip-pip-egg-info-18bcqgjd/psycopg2.egg-info
[pipenv.exceptions.InstallError]:     writing /private/var/folders/bk/t227kx5j60gbnh7gjzf3g_c80000gn/T/pip-pip-egg-info-18bcqgjd/psycopg2.egg-info/PKG-INFO
[pipenv.exceptions.InstallError]:     writing dependency_links to /private/var/folders/bk/t227kx5j60gbnh7gjzf3g_c80000gn/T/pip-pip-egg-info-18bcqgjd/psycopg2.egg-info/dependency_links.txt
[pipenv.exceptions.InstallError]:     writing top-level names to /private/var/folders/bk/t227kx5j60gbnh7gjzf3g_c80000gn/T/pip-pip-egg-info-18bcqgjd/psycopg2.egg-info/top_level.txt
[pipenv.exceptions.InstallError]:     writing manifest file '/private/var/folders/bk/t227kx5j60gbnh7gjzf3g_c80000gn/T/pip-pip-egg-info-18bcqgjd/psycopg2.egg-info/SOURCES.txt'
[pipenv.exceptions.InstallError]:     
[pipenv.exceptions.InstallError]:     Error: pg_config executable not found.
[pipenv.exceptions.InstallError]:     
[pipenv.exceptions.InstallError]:     pg_config is required to build psycopg2 from source.  Please add the directory
[pipenv.exceptions.InstallError]:     containing pg_config to the $PATH or specify the full executable path with the
[pipenv.exceptions.InstallError]:     option:
[pipenv.exceptions.InstallError]:     
[pipenv.exceptions.InstallError]:         python setup.py build_ext --pg-config /path/to/pg_config build ...
[pipenv.exceptions.InstallError]:     
[pipenv.exceptions.InstallError]:     or with the pg_config option in 'setup.cfg'.
[pipenv.exceptions.InstallError]:     
[pipenv.exceptions.InstallError]:     If you prefer to avoid building psycopg2 from source, please install the PyPI
[pipenv.exceptions.InstallError]:     'psycopg2-binary' package instead.
[pipenv.exceptions.InstallError]:     
[pipenv.exceptions.InstallError]:     For further information please check the 'doc/src/install.rst' file (also at
[pipenv.exceptions.InstallError]:     <https://www.psycopg.org/docs/install.html>).
[pipenv.exceptions.InstallError]:     
[pipenv.exceptions.InstallError]:     ----------------------------------------
[pipenv.exceptions.InstallError]: ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Couldn't install package: psycopg2
 Package installation failed...
  ☤  ▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉ 0/1 — 00:00:01

Do you know what is going on with the psycopg2 package?

@jenhagg
Collaborator Author

jenhagg commented Aug 25, 2020

@rouille give this a try: brew install postgresql. I was able to pipenv sync on a clean virtual environment, but psycopg2-binary might still be needed to use the package. I'll do a bit more investigation.

Update: try brew install libpq first. That seems to work on my machine and is simpler than a full postgres install.

@rouille
Collaborator

rouille commented Aug 25, 2020

> @rouille give this a try brew install postgresql. I was able to pipenv sync on a clean virtual environment but might still need psycopg2-binary to use the package.. I'll do a bit more investigation

We cannot access postgres within the container?

@jenhagg
Collaborator Author

jenhagg commented Aug 25, 2020

> > @rouille give this a try brew install postgresql. I was able to pipenv sync on a clean virtual environment but might still need psycopg2-binary to use the package.. I'll do a bit more investigation
>
> We cannot access postgres within the container?

We can, I think it's only needed for the dependencies to build psycopg2, you don't actually have to run postgres. That should also get you tools like psql which you can use to query postgres running in a container.

@BainanXia
Collaborator

@jon-hagg I encountered the same failure as @rouille did above when I tried pipenv sync after brew install libpq. However, with a full postgresql install, it works.

@jenhagg
Collaborator Author

jenhagg commented Aug 25, 2020

> @jon-hagg I encountered the same failure as @rouille did above when I tried pipenv sync after brew install libpq. However, with a full postgresql install, it works.

Cool, thanks for checking. I had an existing postgres installation, so I wasn't sure if it was needed. One thing about the libpq option: did you add the path to bashrc (or zshrc) and reload after brew install but before pipenv sync? When I did that, it fixed a different (runtime) error I was getting with psycopg2.
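
Since the install error earlier in the thread says that pg_config must be on the PATH to build psycopg2 from source, a quick way to check this before running pipenv sync is sketched below. The helper names find_pg_config and describe_build_readiness are made up for this example; shutil.which is the standard-library lookup. A likely explanation for the libpq difference, though not verified here, is that Homebrew does not link libpq's binaries into the default PATH.

```python
import shutil


def find_pg_config(search_path=None):
    """Return the full path to pg_config, or None if it is not found.

    search_path defaults to the current PATH when left as None.
    """
    return shutil.which("pg_config", path=search_path)


def describe_build_readiness(search_path=None):
    """Summarize whether a source build of psycopg2 can locate pg_config."""
    location = find_pg_config(search_path)
    if location is None:
        return "pg_config not found: building psycopg2 from source will fail"
    return f"pg_config found at {location}"
```

Calling print(describe_build_readiness()) in the same shell that will run pipenv sync shows whether the PATH change has taken effect.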

@BainanXia
Collaborator

> > @jon-hagg I encountered the same failure as @rouille did above when I tried pipenv sync after brew install libpq. However, with a full postgresql install, it works.
>
> Cool, thanks for checking. I had an existing postgres installation so wasn't sure if it was needed. One thing about the libpq option - did you add the path to bashrc (or zshrc) and reload after brew install but before pipenv sync? When I did that it fixed a different (runtime) error I was getting with psycopg2.

No, I didn't do that. I simply tried pipenv sync after brew install, and it gave me the failure I mentioned above.

@BainanXia
Collaborator

I spun up the local db container as described in the readme and tried to run the tests. I got the following error message:

=============================================================================================== short test summary info ================================================================================================
FAILED powersimdata/data_access/tests/test_execute_list_store.py::test_err_handle - psycopg2.OperationalError: FATAL:  database "psd" does not exist
FAILED powersimdata/data_access/tests/test_scenario_list_store.py::test_add_entry_missing_required_raises - psycopg2.OperationalError: FATAL:  database "psd" does not exist
ERROR powersimdata/data_access/tests/test_execute_list_store.py::test_select_no_limit - psycopg2.OperationalError: FATAL:  database "psd" does not exist
ERROR powersimdata/data_access/tests/test_execute_list_store.py::test_select_with_limit - psycopg2.OperationalError: FATAL:  database "psd" does not exist
ERROR powersimdata/data_access/tests/test_execute_list_store.py::test_add_entry - psycopg2.OperationalError: FATAL:  database "psd" does not exist
ERROR powersimdata/data_access/tests/test_execute_list_store.py::test_update_entry - psycopg2.OperationalError: FATAL:  database "psd" does not exist
ERROR powersimdata/data_access/tests/test_execute_list_store.py::test_delete_entry - psycopg2.OperationalError: FATAL:  database "psd" does not exist
ERROR powersimdata/data_access/tests/test_scenario_list_store.py::test_select_no_limit - psycopg2.OperationalError: FATAL:  database "psd" does not exist
ERROR powersimdata/data_access/tests/test_scenario_list_store.py::test_select_with_limit - psycopg2.OperationalError: FATAL:  database "psd" does not exist
ERROR powersimdata/data_access/tests/test_scenario_list_store.py::test_add_entry - psycopg2.OperationalError: FATAL:  database "psd" does not exist
ERROR powersimdata/data_access/tests/test_scenario_list_store.py::test_delete_entry - psycopg2.OperationalError: FATAL:  database "psd" does not exist
ERROR powersimdata/data_access/tests/test_sql_store.py::test_select_where - psycopg2.OperationalError: FATAL:  database "psd" does not exist
ERROR powersimdata/data_access/tests/test_sql_store.py::test_select_all - psycopg2.OperationalError: FATAL:  database "psd" does not exist
ERROR powersimdata/data_access/tests/test_sql_store.py::test_insert - psycopg2.OperationalError: FATAL:  database "psd" does not exist
ERROR powersimdata/data_access/tests/test_sql_store.py::test_delete - psycopg2.OperationalError: FATAL:  database "psd" does not exist
================================================================================= 2 failed, 216 passed, 13 errors in 89.22s (0:01:29) ==================================================================================
(PowerSimData) bxia PowerSimData (jon/postgres) $ 

I believe I didn't create the database properly. Once I have the local db container running, how do I configure the test database according to the schema @jon-hagg mentioned in the gist?

@jenhagg
Collaborator Author

jenhagg commented Aug 25, 2020

@BainanXia sorry, I'll add some instructions to the readme.

@rouille
Collaborator

rouille commented Aug 26, 2020

If we have to install postgres on our system in order to install psycopg2, which is required in sql_store.py, what is the point of using docker?

@kasparm
Contributor

kasparm commented Aug 26, 2020

We should be able to use the dev server to test this.
Also, on Linux you will need to install libpq-dev to have the database driver.

Collaborator

@BainanXia BainanXia left a comment

Thanks for the guidance. I finally got all tests to pass.

(PowerSimData) bxia PowerSimData (jon/postgres) $ pytest .
================================================================================================= test session starts ==================================================================================================
platform darwin -- Python 3.8.3, pytest-5.4.3, py-1.9.0, pluggy-0.13.1
rootdir: /Users/bainanxia/OneDrive - Gates Ventures/Documents/GitHub/PowerSimData, inifile: pytest.ini
collected 231 items                                                                                                                                                                                                    

powersimdata/data_access/tests/test_execute_list_store.py ......                                                                                                                                                 [  2%]
powersimdata/data_access/tests/test_scenario_list_store.py .....                                                                                                                                                 [  4%]
powersimdata/data_access/tests/test_sql_store.py ....                                                                                                                                                            [  6%]
powersimdata/design/tests/test_object_persistence.py ...                                                                                                                                                         [  7%]
powersimdata/design/tests/test_resource_target_manager.py ...............                                                                                                                                        [ 14%]
powersimdata/design/tests/test_scenario_info.py ........                                                                                                                                                         [ 17%]
powersimdata/design/tests/test_strategies.py ................                                                                                                                                                    [ 24%]
powersimdata/design/tests/test_target_manager_input.py ...                                                                                                                                                       [ 25%]
powersimdata/design/tests/test_transmission.py .............................................................                                                                                                     [ 52%]
powersimdata/input/tests/test_change_table.py ............................                                                                                                                                       [ 64%]
powersimdata/input/tests/test_grid.py ............................                                                                                                                                               [ 76%]
powersimdata/input/tests/test_transform_grid.py ..............                                                                                                                                                   [ 82%]
powersimdata/input/tests/test_transform_profile.py ...................                                                                                                                                           [ 90%]
powersimdata/tests/test_mocks.py ..........                                                                                                                                                                      [ 95%]
powersimdata/utility/tests/test_distance.py ...                                                                                                                                                                  [ 96%]
powersimdata/utility/tests/test_helpers.py .                                                                                                                                                                     [ 96%]
powersimdata/utility/tests/test_transfer_data.py .......                                                                                                                                                         [100%]

============================================================================================ 231 passed in 77.36s (0:01:17) ============================================================================================

For people who would like to set up a database within a local container for testing purposes:

  • Have Docker Engine and Docker Compose installed on the local machine.
  • Run docker-compose -f stack.yml up in the directory containing stack.yml.
  • In a new terminal tab, create schema.sql and paste in what @jon-hagg has written in the gist.
  • Run psql -U postgres -h localhost, using the password 'example', to get into the psql shell of the container.
  • Run CREATE DATABASE psd; then \c psd to connect to it.
  • In the database shell (prompt psd=#), run \i schema.sql. To check that the two tables were created successfully in the psd database, run \dt.
  • With the database set up and running in the local container, run pytest.
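
The psql steps above could also be scripted. The following is only a sketch under stated assumptions: the container from stack.yml is listening on localhost with user postgres and password 'example' as in the steps, schema.sql has been saved from the gist into the working directory, and the function names connection_params and create_database_and_schema are hypothetical. Nothing like this is part of the PR.

```python
import pathlib

# Values taken from the steps above; adjust them if your stack.yml differs.
ADMIN_PARAMS = {"host": "localhost", "user": "postgres", "password": "example"}


def connection_params(dbname):
    """Connection keyword arguments for a database in the local container."""
    return {**ADMIN_PARAMS, "dbname": dbname}


def create_database_and_schema(schema_path="schema.sql", dbname="psd"):
    # Imported lazily so connection_params() works without the driver installed.
    import psycopg2
    from psycopg2 import sql

    # CREATE DATABASE cannot run inside a transaction block, so use autocommit.
    admin = psycopg2.connect(**connection_params("postgres"))
    admin.autocommit = True
    try:
        with admin.cursor() as cur:
            cur.execute(sql.SQL("CREATE DATABASE {}").format(sql.Identifier(dbname)))
    finally:
        admin.close()

    # Reconnect to the new database (the equivalent of "\c psd" in psql)
    # before loading the schema, so the tables land in psd and not in postgres.
    schema = pathlib.Path(schema_path).read_text()
    target = psycopg2.connect(**connection_params(dbname))
    try:
        # "with target" commits on success and rolls back on error.
        with target, target.cursor() as cur:
            cur.execute(schema)
    finally:
        target.close()
```

Running create_database_and_schema() once after docker-compose is up should leave the database ready for pytest, provided the assumptions above hold.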

@kasparm
Contributor

kasparm commented Aug 27, 2020

I am testing this on the dev server. I have the db running in a container, can connect to the database via psql, and can confirm the db is running and set up. The tests fail, saying the table does not exist:

_________________________________________________________________ test_select_no_limit __________________________________________________________________

store = <powersimdata.data_access.tests.test_execute_list_store.NoEffectSqlStore object at 0x7fdcbb664940>

    @pytest.mark.integration
    def test_select_no_limit(store):
>       store.add_entry(_get_test_row())

powersimdata/data_access/tests/test_execute_list_store.py:42:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
powersimdata/data_access/execute_list.py:41: in add_entry
    self.cur.execute(sql, (scenario_id, status,))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <cursor object at 0x7fdcbab41740; closed: 0>
query = Composed([SQL('INSERT INTO '), Identifier('execute_list'), SQL(' ('), Composed([Identifier('id'), SQL(','), Identifier('status')]), SQL(') VALUES ('), Composed([Placeholder(''), SQL(','), Placeholder('')]), SQL(')')])
vars = (9001, 'created')

    def execute(self, query, vars=None):
        self.index = OrderedDict()
        self._query_executed = True
>       return super(DictCursor, self).execute(query, vars)
E       psycopg2.errors.UndefinedTable: relation "execute_list" does not exist
E       LINE 1: INSERT INTO "execute_list" ("id","status") VALUES (9001,'cre...
E                           ^

../.local/share/virtualenvs/PostREISE-fKepjoJ1/lib/python3.8/site-packages/psycopg2/extras.py:143: UndefinedTable
_____________________________________________

Any suggestions on how to dig further?

@BainanXia
Collaborator

@kasparm Did you create the two tables according to the schema.sql in the gist that @jon-hagg linked earlier in this thread?

@kasparm
Contributor

kasparm commented Aug 27, 2020

@BainanXia yes. Feel free to check it out. The db container is running on the dev server and you should be able to connect.

@BainanXia
Collaborator

@kasparm Try testing it again. It should work now.

@kasparm
Contributor

kasparm commented Aug 27, 2020

What did you change?

@BainanXia
Collaborator

@kasparm The two tables should be created in the psd database after connecting to it (when the prompt is psd=#) rather than in the default database of the psql shell (when the prompt is postgres=#).
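
A small diagnostic along these lines can catch the same mistake: it reports which database a cursor is connected to and which expected tables are missing from it. This is a sketch, not part of the PR. The helper names are hypothetical, the table name scenario_list is assumed (only execute_list appears in the traceback above), and the tables are assumed to live in the default public schema.

```python
def current_database(cur):
    """Name of the database the cursor's connection is attached to."""
    cur.execute("SELECT current_database()")
    return cur.fetchone()[0]


def missing_tables(cur, expected):
    """Return the expected table names that do not exist in the public schema."""
    cur.execute(
        "SELECT table_name FROM information_schema.tables "
        "WHERE table_schema = 'public'"
    )
    present = {row[0] for row in cur.fetchall()}
    return sorted(set(expected) - present)


def check_local_container():
    # Imported lazily; connection values follow the setup steps in this thread.
    import psycopg2

    conn = psycopg2.connect(
        host="localhost", user="postgres", password="example", dbname="psd"
    )
    try:
        with conn.cursor() as cur:
            name = current_database(cur)
            # "scenario_list" is an assumed name for the second table.
            return name, missing_tables(cur, ["execute_list", "scenario_list"])
    finally:
        conn.close()
```

If the tables were created while connected to postgres instead of psd, check_local_container() would report both as missing.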

@kasparm
Contributor

kasparm commented Aug 27, 2020

Thanks for the help. So I had created the tables in the postgres database rather than in the psd database.
@jon-hagg would it be possible to keep the container running on the dev server so the code can be tested there in the future? This would make the review process easier.

@jenhagg
Collaborator Author

jenhagg commented Aug 27, 2020

Running on the dev server is one option, though I was thinking we could automate the setup for running locally; it just needs a bit more work. I believe we can run the container in GitHub Actions as well.

@rouille
Collaborator

rouille commented Aug 27, 2020

Sorry guys, I am kind of lost. Would it be possible to have a step-by-step refresher on what to install and how to run it?

@jenhagg
Collaborator Author

jenhagg commented Aug 27, 2020

@rouille see if the readme I added is helpful, combined with the schema.sql file from the gist. If it's unclear or missing anything (besides the gist, which will eventually be checked in), I'd like to update it.

@BainanXia
Collaborator

@rouille I've written down the steps I went through to set up the local database and run the test above.

@rouille
Collaborator

rouille commented Aug 27, 2020

Do you need to install postgresql, or is libpq enough?

@BainanXia
Collaborator

@rouille libpq alone doesn't work for me. I installed postgresql as well.

@jenhagg jenhagg merged commit a1e8872 into develop Aug 27, 2020
@jenhagg jenhagg deleted the jon/postgres branch August 27, 2020 21:30
@ahurli ahurli mentioned this pull request Mar 11, 2021

Development

Successfully merging this pull request may close these issues.

Build data access layer for scenario and execute list using sql db

4 participants