Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor to suport SQLAlchemy 2.0 migration #14

Merged
merged 9 commits into from
Apr 18, 2023

Conversation

catileptic
Copy link
Contributor

@catileptic catileptic commented Apr 4, 2023

Refactoring work is required to make FTM-store work with the latest version of SQLAlchemy (2.0+).

Changes made:

@catileptic catileptic marked this pull request as draft April 4, 2023 12:43
@catileptic catileptic changed the title Add Dockerfile, Makefile. refactor to suport SQLAlchemy 2.0 migration Apr 6, 2023
@catileptic catileptic marked this pull request as ready for review April 12, 2023 13:45
Copy link

@tillprochaska tillprochaska left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You asked for it so I had a look and left a couple of dumb questions 🙃 But I’ll leave the actual approval to someone who’s a little more familiar with the codebase and possible implications on Aleph.

Dockerfile Outdated Show resolved Hide resolved
Dockerfile Outdated
COPY . /ftmstore
WORKDIR /ftmstore

RUN pip3 install --no-cache-dir followthemoney SQLAlchemy postgres pytest pyicu

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest I never really understood the Python packaging conventions etc., but shouldn’t we install the dependencies specified in setup.py?

Also, out of curiosity, why is pyicu required?

Suggested change
RUN pip3 install --no-cache-dir followthemoney SQLAlchemy postgres pytest pyicu
RUN pip3 install --no-cache-dir pytest pyicu
RUN pip3 install --no-cache-dir -e .[postgresql]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm installing it in response to a warning emitted from normality: ICUWarning: Install 'pyicu' for better text transliteration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are installing the dependencies listed in setup.py (plus pytest). I am unsure whether the intention was to say I missed a dependency or whether I installed something that wasn't needed.

Copy link

@tillprochaska tillprochaska Apr 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry, my wording was unclear.

I meant to suggest installing the dependencies by referring setup.py with pip install -e .[postgresql] rather than hardcoding them in the Dockerfile.

But as I'm not super knowledgeable about all the package tooling in the Python ecosystem, there may be something I'm overlooking why that's not possible!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we are hardcoding the dependencies because there's no easy way of saying pip install dependencies from setup.py, but not the project itself. The typical workaround here would be to have a requirements.txt file and then setup.py to read that (see for instance https://stackoverflow.com/questions/26900328/install-dependencies-from-setup-py). I would advise doing that because it would help clear up the project structure, especially for local development.

Copy link

@tillprochaska tillprochaska Apr 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks for clarifying! I don’t have any opinion on this at all, just trying to learn something, so I have a follow-up question:

When I have a package in /foo and I run pip install -e /foo it installs the package (and also its requirements). So if I have another project bar that uses foo as a dependency there could be a conflict. I thought that’s the reason why we use virtual environments?

So the alternative is to keep a separate requirements.txt for development. But then you start maintaining the same list of dependencies in requirements.txt and setup.py? 🤯

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The virtual environment doesn't protect you from transitive dependency hell, unfortunately. But stating your requirements in a txt file makes it easier to create a virtualenv (or provision a Docker image). The next step here would be to use a lock file as well.

You don't need to have the dependencies listed twice, because you can have setup.py read the requirements.txt file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a requirements.txt file, added the dependencies. I kept the extras_require in setup.py as per best practices. Also, I checked the setup.py for aleph and it seems like it doesn't read the requitements.txt file so I didn't either. This is up for change if we feel we should implement it differently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @catileptic ! I feel like we don't pip install aleph, so it's not that important, whereas we do depend on followthemoney-store in a few places, so it would be great for setup.py and requirements.txt not to drift apart. Otherwise we're running the risk of developing against something different than what we're installing.

Dockerfile Show resolved Hide resolved
ftmstore/dataset.py Show resolved Hide resolved
@@ -31,8 +31,7 @@ jobs:
pip install coverage wheel pytest
pip install -e ".[postgresql]"
- name: Run the tests
run: |
pytest
run: make test

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is personal preference, but I’m not a huge fan of running tests in CI within Docker containers if the environment (like in this case) is reasonable simple. The tests could probably run in ~10 seconds, but now it takes 2+ mins because the Docker image has to be built etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are running some tests against a postgres database which we need to spin up a Docker image for. Does this feel like an acceptable compromise, in this case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I agree with Till. But we need the db container and since we're mounting the source code as a volume into the container this is an acceptable drawback for development (i.e. allows quick iterations) and it follows the same patterns we have in other repos (not that that's an excuse).

Makefile Outdated Show resolved Hide resolved
docker-compose.yml Outdated Show resolved Hide resolved
@catileptic catileptic force-pushed the feature/sqlalchemy-2-migration branch from 06e26a1 to bb71c98 Compare April 17, 2023 13:59
@catileptic catileptic force-pushed the feature/sqlalchemy-2-migration branch from d1db6bc to bb71c98 Compare April 17, 2023 14:58
Copy link
Contributor

@stchris stchris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, @catileptic ! Thanks for fixing everything and for adding a development environment in the process

@catileptic catileptic merged commit 3e58d77 into main Apr 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants