refactor to support SQLAlchemy 2.0 migration #14
Conversation
You asked for it, so I had a look and left a couple of dumb questions 🙃 But I’ll leave the actual approval to someone who’s a little more familiar with the codebase and the possible implications for Aleph.
Dockerfile (Outdated)

```dockerfile
COPY . /ftmstore
WORKDIR /ftmstore

RUN pip3 install --no-cache-dir followthemoney SQLAlchemy postgres pytest pyicu
```
To be honest I never really understood the Python packaging conventions etc., but shouldn’t we install the dependencies specified in setup.py? Also, out of curiosity, why is pyicu required?
Suggested change:

```diff
-RUN pip3 install --no-cache-dir followthemoney SQLAlchemy postgres pytest pyicu
+RUN pip3 install --no-cache-dir pytest pyicu
+RUN pip3 install --no-cache-dir -e .[postgresql]
```
I'm installing it in response to a warning emitted from normality: `ICUWarning: Install 'pyicu' for better text transliteration.`
We are installing the dependencies listed in setup.py (plus pytest). I am unsure whether the intention was to say I missed a dependency or whether I installed something that wasn't needed.
Ah, sorry, my wording was unclear. I meant to suggest installing the dependencies by referring to setup.py with pip install -e .[postgresql] rather than hardcoding them in the Dockerfile. But as I'm not super knowledgeable about all the package tooling in the Python ecosystem, there may be something I'm overlooking that makes this impossible!
I guess we are hardcoding the dependencies because there's no easy way of saying "pip install the dependencies from setup.py, but not the project itself". The typical workaround is to have a requirements.txt file and have setup.py read it (see for instance https://stackoverflow.com/questions/26900328/install-dependencies-from-setup-py). I would advise doing that because it would help clear up the project structure, especially for local development.
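As a sketch of that workaround (the helper name and file layout here are assumptions for illustration, not code from this repo), setup.py can parse requirements.txt into install_requires:

```python
from pathlib import Path


def read_requirements(path="requirements.txt"):
    """Parse a pip-style requirements file into a plain list of dependency
    specifiers, skipping blank lines and comments, suitable for passing
    to setuptools' install_requires."""
    return [
        line.strip()
        for line in Path(path).read_text().splitlines()
        if line.strip() and not line.startswith("#")
    ]


# In setup.py this would feed setuptools, e.g.:
#   setup(name="followthemoney-store",
#         install_requires=read_requirements(), ...)
```

That way the Dockerfile, local virtualenvs, and setup.py all consume the same single list.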
Okay, thanks for clarifying! I don’t have any opinion on this at all, just trying to learn something, so I have a follow-up question: When I have a package in /foo and I run pip install -e /foo, it installs the package (and also its requirements). So if I have another project bar that uses foo as a dependency, there could be a conflict. I thought that’s the reason why we use virtual environments? So the alternative is to keep a separate requirements.txt for development. But then you start maintaining the same list of dependencies in requirements.txt and setup.py? 🤯
The virtual environment doesn't protect you from transitive dependency hell, unfortunately. But stating your requirements in a txt file makes it easier to create a virtualenv (or provision a Docker image). The next step here would be to use a lock file as well. You don't need to list the dependencies twice, because you can have setup.py read the requirements.txt file.
I created a requirements.txt file and added the dependencies. I kept the extras_require in setup.py as per best practices. Also, I checked the setup.py for aleph and it seems like it doesn't read the requirements.txt file, so I didn't either. This is up for change if we feel we should implement it differently.
Thanks, @catileptic! I feel like we don't pip install aleph, so it's not that important, whereas we do depend on followthemoney-store in a few places, so it would be great for setup.py and requirements.txt not to drift apart. Otherwise we're running the risk of developing against something different than what we're installing.
```diff
@@ -31,8 +31,7 @@ jobs:
       pip install coverage wheel pytest
       pip install -e ".[postgresql]"
   - name: Run the tests
-    run: |
-      pytest
+    run: make test
```
This is personal preference, but I’m not a huge fan of running tests in CI within Docker containers if the environment (like in this case) is reasonably simple. The tests could probably run in ~10 seconds, but now they take 2+ minutes because the Docker image has to be built first.
We are running some tests against a Postgres database, which we need to spin up a Docker container for. Does this feel like an acceptable compromise in this case?
In general I agree with Till. But we need the db container and since we're mounting the source code as a volume into the container this is an acceptable drawback for development (i.e. allows quick iterations) and it follows the same patterns we have in other repos (not that that's an excuse).
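For reference, the wiring described above might look roughly like this in a Makefile (the service name and use of docker-compose are assumptions for illustration, not necessarily what this repo does):

```make
# Hypothetical sketch: run the test suite inside the app container, with the
# Postgres service brought up by docker-compose and the source mounted as a
# volume so code changes are picked up without rebuilding the image.
test:
	docker-compose run --rm app pytest
```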
Force-pushed from 06e26a1 to bb71c98
Force-pushed from d1db6bc to bb71c98
Looks great, @catileptic! Thanks for fixing everything and for adding a development environment in the process.
Refactoring work is required to make FTM-store work with the latest version of SQLAlchemy (2.0+).

Changes made:
- self.engine is no longer passed as a parameter to the MetaData() object instantiation
- instead of Inspector.from_engine, use sqlalchemy.inspect() called with self.engine as a parameter to obtain the inspect object that the Store class in ftmstore was using
- pass sqlalchemy.sql.functions directly to select()
- call execute() on a connection, not on the Engine object