
Python 3 breaks gcp_api #108

Open
tkaymak opened this issue Aug 9, 2017 · 7 comments

Comments

@tkaymak

tkaymak commented Aug 9, 2017

Since this docker project switched to Python 3, the new airflow 1.8.2 will no longer build, because setup.py requires google-cloud-dataflow when the gcp_api extra is selected, and that package is not yet available for Python 3.

Moreover, even after taking the mentioned package out of setup.py, there seems to be a problem with the snakebite package, which is used by the BigQuerySensor:

[2017-08-09 07:55:08,676] {models.py:283} ERROR - Failed to import: /usr/local/airflow/dags/dag.py
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 280, in process_file
    m = imp.load_source(mod_name, filepath)
  File "/usr/local/lib/python3.6/imp.py", line 172, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 675, in _load
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 205, in _call_with_frames_removed
  File "/usr/local/airflow/dags/dag.py", line 7, in <module>
    from airflow.contrib.sensors.bigquery_sensor import BigQueryTableSensor
  File "/usr/local/lib/python3.6/site-packages/airflow/contrib/sensors/bigquery_sensor.py", line 17, in <module>
    from airflow.operators.sensors import BaseSensorOperator
  File "/usr/local/lib/python3.6/site-packages/airflow/operators/sensors.py", line 32, in <module>
    from airflow.hooks.hdfs_hook import HDFSHook
  File "/usr/local/lib/python3.6/site-packages/airflow/hooks/hdfs_hook.py", line 20, in <module>
    from snakebite.client import Client, HAClient, Namenode, AutoConfigClient
  File "/usr/local/lib/python3.6/site-packages/snakebite/client.py", line 1473
    baseTime = min(time * (1L << retries), cap);
                            ^
SyntaxError: invalid syntax
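For context, a minimal sketch of the incompatibility the traceback points at: `1L` is a Python 2 long-integer literal, which Python 3 rejects outright because plain `int` is already arbitrary-precision. The variable names and values below are made up for illustration, not taken from snakebite:

```python
# Illustration of the snakebite SyntaxError above: '1L' is a Python 2
# long literal and invalid syntax in Python 3. Values here are assumed
# for demonstration only.
retries, delay, cap = 3, 2, 100

# Python 2 source (SyntaxError under Python 3):
#   baseTime = min(time * (1L << retries), cap)
# Python 3 equivalent -- drop the 'L' suffix:
base_time = min(delay * (1 << retries), cap)
print(base_time)  # 2 * (1 << 3) = 16
```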
@tkaymak

tkaymak commented Aug 9, 2017

I've successfully tested that it works with the 2.7-stretch base image.

@ghost

ghost commented Aug 28, 2017

The snakebite error is the same as in issue #77.

Why not vote for this so Google upgrades the Cloud Dataflow API to Python 3: https://googlecloudplatform.uservoice.com/forums/302628-cloud-dataflow/suggestions/31055887-python-3x-support

@88sanjay

88sanjay commented Oct 5, 2017

I had to install a couple more libraries to get the BigQuery hook to work.

@ayoungprogrammer

ayoungprogrammer commented Jan 20, 2018

I was able to get Airflow working on Python 2 with the following changes:

- change the base image to FROM python:2-stretch
- change the dependency installation step to:

RUN set -ex \
    && buildDeps=' \
        python-dev \
        libkrb5-dev \
        libsasl2-dev \
        libssl-dev \
        libffi-dev \
        build-essential \
        libblas-dev \
        liblapack-dev \
        libpq-dev \
        git \
    ' \
    && apt-get update -yqq \
    && apt-get install -yqq --no-install-recommends \
        $buildDeps \
        python-pip \
        python-requests \
        apt-utils \
        curl \
        rsync \
        netcat \
        locales \
    && sed -i 's/^# en_US.UTF-8 UTF-8$/en_US.UTF-8 UTF-8/g' /etc/locale.gen \
    && locale-gen \
    && update-locale LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 \
    && useradd -ms /bin/bash -d ${AIRFLOW_HOME} airflow \
    && python2 -m pip install -U pip setuptools wheel \
    && pip2 install Cython \
    && pip2 install six \
    && pip2 install pytz \
    && pip2 install pyOpenSSL \
    && pip2 install ndg-httpsclient \
    && pip2 install pyasn1 \
    && pip2 install apache-airflow[crypto,celery,postgres,hive,jdbc,slack,gcp_api]==$AIRFLOW_VERSION \
    && pip2 install celery[redis]==4.0.2 \
    && apt-get purge --auto-remove -yqq $buildDeps \
    && apt-get clean \
    && rm -rf \
        /var/lib/apt/lists/* \
        /tmp/* \
        /var/tmp/* \
        /usr/share/man \
        /usr/share/doc \
        /usr/share/doc-base

For running Python 3 scripts, I used a BashOperator to run python3 via conda.
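As a sketch of that pattern: a BashOperator-style task simply shells out to another interpreter, so the task can run under Python 3 even while Airflow itself runs under Python 2. The core mechanism can be reproduced with subprocess alone; in an actual DAG the command would be something like `conda run -n py3env python3 /path/to/script.py`, where the environment name and script path are placeholders for your setup:

```python
# Minimal sketch of what a BashOperator-style task does: shell out to a
# separate interpreter. Here we invoke the current Python to keep the
# example self-contained; the conda command above is an assumption about
# a typical setup, not code from this repository.
import subprocess
import sys

command = [sys.executable, "-c", "import sys; print(sys.version_info[0])"]
result = subprocess.run(command, capture_output=True, text=True, check=True)
print("subprocess ran under Python major version:", result.stdout.strip())
```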

@Jonathan34

This also breaks some other contrib operators, like docker and dataflow.

@hden

hden commented May 23, 2018

Solved by apache/airflow#3273.
Looking forward to the next release.

@agiratech-gopal

Using Airflow 1.10.8 with Python 3.7 solves the issue.
