
Running Airflow in Docker - Tutorial dags backfill tutorial example fails with TypeError: cannot serialize '_io.TextIOWrapper' object #14379

Closed
DougForrest opened this issue Feb 23, 2021 · 8 comments
Labels
kind:bug This is a clearly a bug

Comments

@DougForrest

Hello, apologies in advance if this is a newbie mistake. I'm working through the tutorial locally on Docker Desktop for Mac, using the docker-compose file from the Running Airflow in Docker guide. I'm copying the tutorial code as is, except for replacing `airflow` with `./airflow.sh` to run inside Docker. All the commands work as expected except the backfill example, which fails with `TypeError: cannot serialize '_io.TextIOWrapper' object`. Please advise.

# start your backfill on a date range
 % ./airflow.sh dags backfill tutorial \
    --start-date 2015-06-01 \
    --end-date 2015-06-07
Creating airflow_tut_airflow-worker_run ... done
BACKEND=postgresql+psycopg2
DB_HOST=postgres
DB_PORT=5432

/home/airflow/.local/lib/python3.6/site-packages/airflow/cli/commands/dag_command.py:62 PendingDeprecationWarning: --ignore-first-depends-on-past is deprecated as the value is always set to True
[2021-02-23 08:11:03,363] {dagbag.py:448} INFO - Filling up the DagBag from /opt/airflow/dags
[2021-02-23 08:11:04,308] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-01T00:00:00+00:00', '--ignore-depends-on-past', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
[2021-02-23 08:11:04,343] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
[2021-02-23 08:11:04,375] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-03T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
[2021-02-23 08:11:04,406] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-04T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
[2021-02-23 08:11:04,446] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-05T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
[2021-02-23 08:11:04,487] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-06T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
[2021-02-23 08:11:04,573] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-07T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/__main__.py", line 40, in main
    args.func(args)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/cli/cli_parser.py", line 48, in command
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/cli.py", line 89, in wrapper
    return f(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/cli/commands/dag_command.py", line 116, in dag_backfill
    run_backwards=args.run_backwards,
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/dag.py", line 1706, in run
    job.run()
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/jobs/base_job.py", line 237, in run
    self._execute()
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/session.py", line 65, in wrapper
    return func(*args, session=session, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/jobs/backfill_job.py", line 805, in _execute
    session=session,
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/session.py", line 62, in wrapper
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/jobs/backfill_job.py", line 727, in _execute_for_run_dates
    session=session,
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/session.py", line 62, in wrapper
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/jobs/backfill_job.py", line 602, in _process_backfill_task_instances
    executor.heartbeat()
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/executors/base_executor.py", line 158, in heartbeat
    self.trigger_tasks(open_slots)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 263, in trigger_tasks
    self._process_tasks(task_tuples_to_send)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 272, in _process_tasks
    key_and_async_results = self._send_tasks_to_celery(task_tuples_to_send)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 332, in _send_tasks_to_celery
    send_task_to_executor, task_tuples_to_send, chunksize=chunksize
  File "/usr/local/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/local/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/usr/local/lib/python3.6/multiprocessing/pool.py", line 424, in _handle_tasks
    put(task)
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/local/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: cannot serialize '_io.TextIOWrapper' object
 % docker --version
Docker version 20.10.2, build 2291f61
% docker-compose --version
docker-compose version 1.27.4, build 40524192
% docker ps
CONTAINER ID   IMAGE                  COMMAND                  CREATED          STATUS                    PORTS                              NAMES
cfb65ccd5432   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   21 minutes ago   Up 21 minutes             8080/tcp                           airflow_tut_airflow-scheduler_1
46b70dfa6c90   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   21 minutes ago   Up 21 minutes             8080/tcp                           airflow_tut_airflow-worker_1
37065f3e3d9b   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   21 minutes ago   Up 21 minutes (healthy)   0.0.0.0:5555->5555/tcp, 8080/tcp   airflow_tut_flower_1
e498390f3c50   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   21 minutes ago   Up 21 minutes (healthy)   0.0.0.0:8080->8080/tcp             airflow_tut_airflow-webserver_1
5844644d1157   postgres:13            "docker-entrypoint.s…"   23 minutes ago   Up 22 minutes (healthy)   5432/tcp                           airflow_tut_postgres_1
929c10f40745   redis:latest           "docker-entrypoint.s…"   23 minutes ago   Up 22 minutes (healthy)   0.0.0.0:6379->6379/tcp             airflow_tut_redis_1
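For context, the final `TypeError` is Python's pickle machinery refusing to serialize an open file handle: when the Celery executor fans tasks out through a `multiprocessing.Pool`, everything it sends to the worker processes must be picklable. A minimal, Airflow-independent sketch of the same failure:

```python
import os
import pickle

# Any open text-mode file is an _io.TextIOWrapper; pickle refuses to
# serialize it, since a file handle is meaningless in another process.
handle = open(os.devnull, "r")
try:
    pickle.dumps(handle)
except TypeError as exc:
    # Python 3.6 words this "cannot serialize '_io.TextIOWrapper' object";
    # newer versions say "cannot pickle" instead.
    print(exc)
finally:
    handle.close()
```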
@DougForrest DougForrest added the kind:bug This is a clearly a bug label Feb 23, 2021
@boring-cyborg

boring-cyborg bot commented Feb 23, 2021

Thanks for opening your first issue here! Be sure to follow the issue template!

@DougForrest
Author

I updated the default Airflow image in docker-compose.yaml, which fixed the problem:

-  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.0.1}
+  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:master-python3.8}

The output above shows the containers were running Python 3.6, while the comments in docker-compose.yaml state the default should be apache/airflow:master-python3.8.
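Since the compose file reads the image from the `AIRFLOW_IMAGE_NAME` variable with a fallback default, the image can also be overridden without editing docker-compose.yaml at all. A sketch (the `docker-compose up` step is illustrative and needs a running Docker daemon):

```shell
# Pick the image via the variable the compose file interpolates:
#   image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.0.1}
export AIRFLOW_IMAGE_NAME=apache/airflow:2.0.1-python3.8

# This is the value docker-compose will substitute for the image:
echo "${AIRFLOW_IMAGE_NAME:-apache/airflow:2.0.1}"

# Then recreate the stack with the overridden image:
# docker-compose up -d
```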

Created a pull request against v2-0-stable; the changes are consistent with the master branch: #14404

@kaxil
Member

kaxil commented Feb 23, 2021

Master and v2-0-stable are a fair bit different in terms of the actual code, and users shouldn't be using images from master in their prod clusters.

@DougForrest
Author

Thanks for the response; I wasn't aware there was a large difference. I can confirm that the image apache/airflow:2.0.1-python3.8 also works in my local environment.

@kaxil
Member

kaxil commented Feb 23, 2021

Awesome, glad that it worked :)

@kaxil kaxil closed this as completed Feb 23, 2021
@yibinlin

+1 I am having this problem. Glad to see a solution here.

@etadelta222

etadelta222 commented Aug 13, 2021

+1 Ran into the same issue when doing the tutorial using the Docker install. Updated docker-compose.yaml to use ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.1.2-python3.8} and it ran successfully.

@potiuk
Member

potiuk commented Aug 13, 2021

> +1 Ran into the same issue when doing the tutorial using the docker install. Updated docker-compose and added ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.1.2-python3.8} and it ran successfully.

So just to double-check: you were also running on Windows 10, and apache/airflow:2.1.2 caused the same problem, @etadelta222?
