
Add Python 3.11 wheels to the CI builds. #1294

Open · wants to merge 3 commits into base: main

Conversation

kleschenko

Please answer these questions before submitting your pull requests. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-682020: Python 3.11 wheels #1289

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am modifying authorization mechanisms
    • I am adding new credentials
    • I am modifying OCSP code
    • I am adding a new dependency
  3. Please describe how your code solves the related issue.

    This PR adds Python 3.11 to the CI builds. The pyarrow wheels are missing at the moment, and we'll have to wait until that is resolved.

@github-actions

github-actions bot commented Oct 25, 2022

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

@idanmiara

idanmiara commented Oct 25, 2022

Thanks @kleschenko,
I've opened a ticket for pyarrow Python 3.11 wheels:
https://issues.apache.org/jira/browse/ARROW-18154
Edit: Oh I see it was already done before, so I'll delete my issue...

@kleschenko
Author

@idanmiara it is also possible to build the 3.11 wheels locally by adding the required Arrow dependencies to the Docker image, if you don't want to wait for the official release and want to try it out earlier. Here is a patch that can be applied on top of this PR to make it work:

diff --git a/ci/docker/connector_build/Dockerfile b/ci/docker/connector_build/Dockerfile
index e895d1a..40dfd31 100644
--- a/ci/docker/connector_build/Dockerfile
+++ b/ci/docker/connector_build/Dockerfile
@@ -16,4 +16,9 @@ RUN git clone https://github.com/matthew-brett/multibuild.git && cd /home/user/m
 
 ENV PATH="${PATH}:/opt/python/cp37-cp37m/bin:/opt/python/cp38-cp38/bin:/opt/python/cp39-cp39/bin:/opt/python/cp310-cp310/bin:/opt/python/cp311-cp311/bin"
 
+RUN yum install -y epel-release || yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-$(cut -d: -f5 /etc/system-release-cpe | cut -d. -f1).noarch.rpm
+RUN yum install -y https://apache.jfrog.io/artifactory/arrow/centos/$(cut -d: -f5 /etc/system-release-cpe | cut -d. -f1)/apache-arrow-release-latest.rpm
+RUN yum install -y --enablerepo=epel arrow-python-devel
+RUN yum install -y --enablerepo=epel arrow-dataset-devel-8.0.0 parquet-devel-8.0.0 arrow-glib-devel-8.0.0 arrow-devel-8.0.0
+
 ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
diff --git a/setup.py b/setup.py
index d4ccdb7..1be8ea9 100644
--- a/setup.py
+++ b/setup.py
@@ -153,9 +153,7 @@ if _ABLE_TO_COMPILE_EXTENSIONS:
             build_ext.build_extension(self, ext)
 
         def _get_arrow_lib_dir(self):
-            if "SF_ARROW_LIBDIR" in os.environ:
-                return os.environ["SF_ARROW_LIBDIR"]
-            return pyarrow.get_library_dirs()[0]
+            return "/usr/lib64"
 
         def _copy_arrow_lib(self):
             libs_to_bundle = self.arrow_libs_to_copy[sys.platform]
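The setup.py hunk in the patch hardcodes `/usr/lib64` as the Arrow library directory, which is fine for the Docker image but loses the existing `SF_ARROW_LIBDIR` override. A less invasive variant would keep the override and the pyarrow fallback, and use the system path only as a last resort. A minimal sketch (the standalone function name and the default path are assumptions for illustration; in the connector this logic lives in a method of the build class):

```python
import os

def get_arrow_lib_dir(default="/usr/lib64"):
    """Resolve the Arrow library directory, keeping existing overrides."""
    # Prefer the SF_ARROW_LIBDIR override already supported by setup.py.
    if "SF_ARROW_LIBDIR" in os.environ:
        return os.environ["SF_ARROW_LIBDIR"]
    # Fall back to the directories reported by an installed pyarrow.
    try:
        import pyarrow
        return pyarrow.get_library_dirs()[0]
    except ImportError:
        # Last resort: the system location used by the Arrow RPMs above.
        return default
```

With this shape the patched Docker build can simply export `SF_ARROW_LIBDIR=/usr/lib64` instead of editing setup.py.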

@kleschenko
Author

I have read the CLA Document and I hereby sign the CLA

@potiuk
Contributor

potiuk commented Oct 26, 2022

We are looking forward to this one being merged in Apache Airflow. Apache Beam is one of the blocking factors for making Airflow work on Python 3.11, and I am trying to get all the OSS projects that we consider friends :) into a concerted effort to make Python 3.11 support work, as Python 3.11 mainly brings huge performance improvements that our users are eager to start using!

We track it in apache/airflow#27264

If there is any help needed, I am happy to help, including by talking to some dependencies of yours (which are likely also Airflow dependencies). Good luck with it :)

potiuk mentioned this pull request Oct 26, 2022
potiuk added a commit to apache/airflow that referenced this pull request Oct 26, 2022
Python 3.11 has been released as scheduled on October 25, 2022 and
this is the first attempt to see how far Airflow (mostly dependencies)
are from being ready to officially support 3.11.

So far we had to exclude the following dependencies:

- [ ] Pyarrow dependency: apache/arrow#14499
- [ ] Google Provider: #27292
  and googleapis/python-bigquery#1386
- [ ] Databricks Provider:
  databricks/databricks-sql-python#59
- [ ] Papermill Provider: nteract/papermill#700
- [ ] Azure Provider: Azure/azure-uamqp-python#334
  and Azure/azure-sdk-for-python#27066
- [ ] Apache Beam Provider: apache/beam#23848
- [ ] Snowflake Provider:
  snowflakedb/snowflake-connector-python#1294
- [ ] JDBC Provider: jpype-project/jpype#1087
- [ ] Hive Provider: cloudera/python-sasl#30

We might decide to release Airflow in 3.11 with those providers
disabled in case they are lagging behind eventually, but for the
moment we want to work with all the projects in concert to be
able to release all providers (Google Provider requires quite
a lot of work and likely Google Team stepping up and community helping
with migration to the latest Google cloud libraries)
@sfc-gh-aling
Collaborator

hey folks, really appreciate your contributions to supporting Python 3.11. I fully understand the necessity and charm of supporting the newer version; I am myself impressed by the performance improvement.

However, there's a lot more to work on behind the scenes other than the changes that should happen in this repo:

    1. we need to wait for dependency libraries to support Python 3.11 (e.g., pyarrow); we prefer not to hack into the code until official support lands.
    2. we need to set up our internal CI/CD to be able to build and test Python 3.11 wheels.
    3. we have other libraries depending on the connector; we need to ensure Python 3.11 works for our downstream libraries as well.

Given that we currently have limited bandwidth and are somewhat blocked by (1), we decided to postpone support to early next year, but we will keep monitoring our backlog and prioritize dynamically. Thanks for your patience!

If you'd like to use the connector on Python 3.11, you could build and install the connector locally (thanks @kleschenko for sharing the scripts).

@potiuk
Contributor

potiuk commented Oct 26, 2022

> hey folks, really appreciate your contributions to supporting Python 3.11. I fully understand the necessity and charm of supporting the newer version; I am myself impressed by the performance improvement.
>
> However, there's a lot more to work on behind the scenes other than the changes that should happen in this repo:
>
> 1. we need to wait for dependency libraries to support Python 3.11 (e.g., pyarrow); we prefer not to hack into the code until official support lands.

FYI, PyArrow support is likely to land in a day or two; they are actively working on it and made a lot of progress today. I am following it closely, and there were many comments and build runs today.

> Given that we currently have limited bandwidth and are somewhat blocked by (1), we decided to postpone support to early next year, but we will keep monitoring our backlog and prioritize dynamically. Thanks for your patience!

Sure, no problem. We are unlikely to get all dependencies in check, but that is OK for us. In Airflow we modified our whole pipeline so that we can disable providers selectively per Python version, so we can release a 3.11-compliant Airflow without the possibility of running Snowflake (the Snowflake provider will be < 3.11). So this does not block us; it's more that eager users will not be able to use Snowflake. Not a big problem for us, but it's good to keep this issue open, so that whenever Snowflake is ready, we can re-enable 3.11 for it.

potiuk added a commit to apache/airflow that referenced this pull request Oct 27, 2022
potiuk added a commit to apache/airflow that referenced this pull request Oct 27, 2022
@joshuataylor

Agreed, we should wait for everything upstream to be fully supported before official support here. Maybe once those are supported, and until this package ships wheels, there could be notes about how to build it? Otherwise I think pip will throw an error about missing wheels if you are on 3.11.
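The error joshuataylor mentions comes from how pip selects distributions: it looks for a wheel whose interpreter tag matches the running Python and, finding none published for 3.11, falls back to the source distribution, which then needs a C toolchain and Arrow headers to compile the extensions. A quick way to see the tag your interpreter advertises (a sketch; the full tag pip matches also includes ABI and platform components):

```python
import sys

# pip selects wheels by interpreter tag, e.g. "cp311" on Python 3.11.
# If no published wheel carries a matching tag, pip falls back to the
# sdist and tries to build the C extensions locally.
tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
print(tag)
```

On Python 3.11 this prints `cp311`, which at the time matched no published snowflake-connector-python wheel.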

potiuk added a commit to apache/airflow that referenced this pull request Oct 31, 2022
@maciejfic

Hello, do you have any update on Python 3.11 wheel? Thanks in advance for any info!

@joshuataylor

Upstream doesn't have wheels yet, so they won't be here for a while. Apache Arrow is having a discussion about releasing a version just for the Python 3.11 wheels.

@idanmiara

> Upstream doesn't have wheels yet, so they won't be here for a while. Apache Arrow is having a discussion about releasing a version just for the Python 3.11 wheels.

apache/arrow#14499 (comment)
It seems that nightly pyarrow wheels for 3.11 are already available via the nightly repo.

@joshuataylor

joshuataylor commented Nov 10, 2022

Yes, but obviously, given the previous comments here, we probably won't see an official wheel-compatible version of snowflake-connector-python yet. It will be easy to build once everything upstream is official, though 🚀

@potiuk
Contributor

potiuk commented Nov 10, 2022

Yeah. I am also holding off on testing in Airflow until at least an RC is out on PyPI; it's far easier to run our test pipeline if dependencies are already in PyPI. From a recent discussion on the Arrow dev list, they are going to release 10.0.1 soon (also because of a bug fix needed by OpenTelemetry) and it will have Python 3.11 binaries. 🤞

@idanmiara

> Yeah. I am also holding off on testing in Airflow until at least an RC is out on PyPI; it's far easier to run our test pipeline if dependencies are already in PyPI. From a recent discussion on the Arrow dev list, they are going to release 10.0.1 soon (also because of a bug fix needed by OpenTelemetry) and it will have Python 3.11 binaries. 🤞

Thanks for this update! Here are the relevant discussions for whoever wants to follow:
https://lists.apache.org/list?dev@arrow.apache.org:lte=1M:[DISCUSS]%20Pyarrow%20wheels%20for%20Python%203.11
https://lists.apache.org/list?dev@arrow.apache.org:lte=1M:Open-Telemetry

@potiuk
Contributor

potiuk commented Nov 23, 2022

FYI: pyarrow yesterday released binary wheels compatible with 3.11 (10.0.1), so this should not be a blocker any more.

@joshuataylor

Awesome! I guess to support 10.0.1, the version pin will need to be bumped here?
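The bump matters because the connector pins pyarrow to a narrow range, so even with 10.0.1 wheels on PyPI the resolver will not pick them up until the pin changes. A toy illustration of how such a range excludes the new release (the exact `>=8.0.0,<8.1.0` range is an assumption for illustration, not necessarily the connector's real pin):

```python
def version_tuple(v):
    """Parse a simple dotted version string into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))

# Hypothetical pin of the ">=8.0.0,<8.1.0" style, as before a bump.
LOW, HIGH = version_tuple("8.0.0"), version_tuple("8.1.0")

def satisfies(version):
    return LOW <= version_tuple(version) < HIGH

print(satisfies("8.0.1"))   # True: inside the pinned range
print(satisfies("10.0.1"))  # False: the new release is outside the pin
```

Real resolvers use PEP 440 specifier matching rather than plain tuple comparison, but the effect is the same: the 3.11-capable release is invisible until the upper bound moves.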

potiuk added a commit to apache/airflow that referenced this pull request Nov 24, 2022
@noamcohen97
Contributor

Created #1349 to bump the PyArrow version and make the required adjustments.

potiuk added a commit to potiuk/airflow that referenced this pull request Jan 19, 2023
@ad-m-ss

ad-m-ss commented Feb 23, 2023

@kleschenko could you rebase? It looks like a lot has moved forward around pyarrow, so it's worth finding out what new blockers we have here.

@surfaceowl

@kleschenko -- floating this towards the top of your queue...
@sfc-gh-aling -- would the team be able to share a brief update on the three groups of blockers from your Oct 26, 2022 reply? Snowflake has made recent commits on arrow, but I am not sure if those issues are directly linked to this one...

    1. need to wait for dependency libraries to support Python 3.11 (e.g., pyarrow)
    2. need to set up internal CI/CD to be able to build and test Python 3.11 wheels
    3. need to ensure Python 3.11 works for downstream libraries as well

Successfully merging this pull request may close these issues: SNOW-682020: Python 3.11 wheels
10 participants