Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework update repository to getML version 1.4.0 #17

Merged
merged 1 commit into from
Mar 2, 2024

Conversation

Urfoex
Copy link
Contributor

@Urfoex Urfoex commented Feb 14, 2024

  • Update notebooks to work with current versions of getML, featuretools, tsfresh, prophet ...
  • Clean up contents of notebooks
  • Add docker and compose files for easy local testing
  • Add requirements with pinned versions for Python 3.8 and 3.11
  • move binder to own project

@Urfoex Urfoex added the enhancement New feature or request label Feb 14, 2024
@Urfoex Urfoex requested review from alxn4, liuzicheng1987, srnnkls and a team February 14, 2024 15:48
Urfoex added a commit that referenced this pull request Feb 15, 2024
…2 #33 #34 #36 #37 #38 #39 #40 Updated and cleaned-up notebooks

using woodwork-types to fit getml and table description

update adventure_works air_pollution atherosclerosis baseball cora consumer_expenditures dodgers formula1 imdb interstate94 loans movie_lens occupancy online_retail robot seznam sfscores stats

update fastprop_benchmark: air_pollution dodgers interstate94 occupancy robot

added Dockerfile, docker-compose.yml, and README.md on how to use them

added requirements.txt for usage with python3.8 and python3.11

rename propositionalization to fastprop_benchmark

move binder into own project

adjusting colors
@Urfoex Urfoex force-pushed the rework_update_repository_1_4_0 branch from f208b78 to 13d2fd2 Compare February 15, 2024 14:18
@Urfoex Urfoex marked this pull request as ready for review February 15, 2024 14:23
@Urfoex Urfoex changed the title [WIP] Rework update repository to getML version 1.4.0 Rework update repository to getML version 1.4.0 Feb 15, 2024
Urfoex added a commit that referenced this pull request Feb 20, 2024
…2 #33 #34 #36 #37 #38 #39 #40 Updated and cleaned-up notebooks

using woodwork-types to fit getml and table description

update adventure_works air_pollution atherosclerosis baseball cora consumer_expenditures dodgers formula1 imdb interstate94 loans movie_lens occupancy online_retail robot seznam sfscores stats

update fastprop_benchmark: air_pollution dodgers interstate94 occupancy robot

added Dockerfile, docker-compose.yml, and README.md on how to use them

added requirements.txt for usage with python3.8 and python3.11

rename propositionalization to fastprop_benchmark

move binder into own project

adjusting colors

adjust readme.md and welcome.md to reflect changes repository
@Urfoex Urfoex force-pushed the rework_update_repository_1_4_0 branch from 13d2fd2 to 7f26ae2 Compare February 20, 2024 00:10
@Urfoex Urfoex mentioned this pull request Feb 20, 2024
Urfoex added a commit that referenced this pull request Feb 21, 2024
…2 #33 #34 #36 #37 #38 #39 #40 Updated and cleaned-up notebooks

using woodwork-types to fit getml and table description

update adventure_works air_pollution atherosclerosis baseball cora consumer_expenditures dodgers formula1 imdb interstate94 loans movie_lens occupancy online_retail robot seznam sfscores stats

update fastprop_benchmark: air_pollution dodgers interstate94 occupancy robot

added Dockerfile, docker-compose.yml, and README.md on how to use them

added requirements.txt for usage with python3.8 and python3.11

rename propositionalization to fastprop_benchmark

move binder into own project

adjusting colors

adjust readme.md and welcome.md to reflect changes repository

simplify dockerfile

add basic binder requirements
@Urfoex Urfoex force-pushed the rework_update_repository_1_4_0 branch from 7f26ae2 to aae50de Compare February 21, 2024 17:39
&4 #16 #17 #18 #19 #20 #21 #22 #23 #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #36 #37 #38 #39 #40 related issues

using woodwork-types to fit getml and table description

update adventure_works air_pollution atherosclerosis baseball cora consumer_expenditures dodgers formula1 imdb interstate94 loans movie_lens occupancy online_retail robot seznam sfscores stats

update fastprop_benchmark: air_pollution dodgers interstate94 occupancy robot

added Dockerfile, docker-compose.yml, and README.md on how to use them

added requirements.txt for usage with python3.8 and python3.11

rename propositionalization to fastprop_benchmark

move binder into own project

adjusting colors

adjust readme.md and welcome.md to reflect changes repository

simplify dockerfile

add basic binder requirements
@Urfoex Urfoex force-pushed the rework_update_repository_1_4_0 branch from aae50de to 11a4d31 Compare February 21, 2024 17:52
Copy link
Contributor

@liuzicheng1987 liuzicheng1987 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thank you so much for the effort!

@srnnkls
Copy link
Collaborator

srnnkls commented Feb 29, 2024

Thank you, this looks great.

I have some questions,but as there are so many changes and I’m on mobile I cannot comment inline, sorry.

  • why have you introduced two runtimes? We have discussed using 3.8 and 3.11 for testing purposes but I think we don’t need both runtimes to be public here
  • you are using manylinux as the baseimage for the runtimes (3.8 & 3.11), why? I think this is a huge image and not necessary for a getml runtime

@Urfoex
Copy link
Contributor Author

Urfoex commented Feb 29, 2024

@srnnkls

why have you introduced two runtimes? We have focused using 3.8 and 3.11 for testing purposes but I think we don’t need both runtimes to be public here

3.8 is the minimum version, and 3.11 the current maximum. 3.11 supports a few newer libraries. And with upcoming changes, support for newer versions, we can quickly start and test them.
It is not necessary to put both. But also the files and code for both are very few.
And when we need to quickly check something, they are there and prepared.

you are using manylknux as the baseimage for the runtimes (3.8 & 3.11), why? I think this is a huge image and not necessary for a getml runtime

That is about 460MB. (https://quay.io/repository/pypa/manylinux2014_x86_64?tab=tags&tag=latest)
The standard Python image is about 360MB. (https://hub.docker.com/_/python/tags)

python:-slim

This image does not contain the common Debian packages contained in the default tag and only contains the minimal Debian packages needed to run python. Unless you are working in an environment where only the python image will be deployed and you have space constraints, we highly recommend using the default image of this repository.

When using this image pip install will work if a suitable built distribution is available for the Python distribution package being installed. pip install may fail when installing a Python distribution package from a source distribution. This image does not contain the Debian packages required to compile extension modules written in other languages.

There is also Jupyter-Lab, getML engne, getML monitor running inside. I would need to try, if they still work in the slim one. (A quick runs in problems with missing curl and can't apt install it… And pysimdjson fails with error: command 'gcc' failed: No such file or directory)
But are ~500MB really that much for just quickly checking things out?

Okay, on disk it is more like 1.56 GB.
And the build notebook images (with installed requirements and things) are 4.53 GB
(The ones build from Python-images are 3.85 GB and 4 GB - not too much different, I think.)

Because of PEP513... PEP600... , manylinux sounded like a good base.

@alxn4 alxn4 merged commit aff38de into 1.4.0 Mar 2, 2024
@srnnkls
Copy link
Collaborator

srnnkls commented Mar 2, 2024

Why the merge? I think there were some points still open.

Also, I think we agreed that a merge is carried out only by PRs‘ owners after approval. Please guys, refrain from merging other’s PRs.

@alxn4
Copy link
Member

alxn4 commented Mar 2, 2024

Let's turn that last open question into an issue and sorry for pushing, but I don't see how this review of a rather small issue will be concluded in time for this Mondays demo. Having the main branch still on version 1.3.2, a commit from over a year ago, gives an impression that does not do justice to all the work we have put into this. And of course, I agree on the matter of the issue: we do not necessarily need two separate docker files, especially versions that diverge with the files in getml-community, and on top that only run on linux/x64.

@Urfoex Urfoex deleted the rework_update_repository_1_4_0 branch March 4, 2024 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants