[airflow] module not found #2305
Comments
Hi @danielc103, note that the official Airflow image does not include this module either. In their official documentation they also explain why: https://airflow.apache.org/docs/stable/installation.html We believe this could be solved by allowing installation of extra Airflow packages, instead of adding all of these (with the consequent image size increase). Find the related open issue here: https://github.com/bitnami/bitnami-docker-airflow/issues/32.
I've rebuilt bitnami image with oracle client installed. I'm still getting module not found errors. In this case it's nested packages.
Broken DAG: [/opt/bitnami/airflow/dags/git/mydag.py] No module named 'acme' or 'cx_oracle' If I move
it's odd even deleting the dags from the git folder and moving them to root
there's no dags in /git folder
@danielc103 Could you let us know how you are installing the
Have you taken a look at this PR? https://github.com/bitnami/bitnami-docker-airflow/pull/33/files You could check it out, by creating/mounting the file
Yep found that out after digging around. Install in
COPY requirements.txt requirements.txt
RUN . /opt/bitnami/airflow/venv/bin/activate && pip install -r requirements.txt
I did not run the deactivate command after. I am rebuilding the image right now to test. Would that also affect the nested acme module?
even adding the deactivate after the requirements install I still get the same errors.
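A quick way to narrow down this kind of error is to ask the interpreter itself what it can see. The snippet below is only a sketch, and `report_importable` is a hypothetical helper name, not part of Airflow or the Bitnami image. It prints which Python executable is running and whether each top-level module can be located. Note that the Oracle driver's import name is `cx_Oracle` (capital O), and that the check only means something if it is run with the same interpreter (the venv one) that Airflow uses.

```python
import importlib.util
import sys


def report_importable(module_names):
    """Map each top-level module name to True if this interpreter can locate it."""
    results = {}
    for name in module_names:
        try:
            results[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for oddly registered modules; treat as missing
            results[name] = False
    return results


if __name__ == "__main__":
    # Shows which Python is answering, e.g. the venv one or the system one
    print("interpreter:", sys.executable)
    for name, found in report_importable(["cx_Oracle", "acme"]).items():
        print(f"{name}: {'found' if found else 'NOT found'}")
```

If `cx_Oracle` shows as found here but the DAG still fails, the pip install probably went into a different interpreter than the one loading the DAGs.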
@danielc103 Could you share the requirements.txt file? Also, it would be great if you could provide us with the DAG you're using, or a similar one where similar issues happen. That way we can try to reproduce your issues locally. P.S. Just checking, I assume the "acme" module is listed in "requirements.txt", right?
requirements.txt is just
Similar to this folder structure https://github.com/gtoonstra/etl-with-airflow/tree/master/examples/etl-example/dags. This dag is at
from __future__ import print_function
import airflow
from datetime import datetime, timedelta
from acme.operators.dwh_operators import PostgresOperatorWithTemplatedParams
from acme.operators.dwh_operators import AuditOperator
from airflow.models import Variable

args = {
    'owner': 'airflow',
    'start_date': airflow.utils.dates.days_ago(7),
    'provide_context': True
}

tmpl_search_path = Variable.get("sql_path")

dag = airflow.DAG(
    'customer_clear',
    schedule_interval="@once",
    dagrun_timeout=timedelta(minutes=60),
    template_searchpath=tmpl_search_path,
    default_args=args,
    max_active_runs=1)

get_auditid = AuditOperator(
    task_id='get_audit_id',
    postgres_conn_id='postgres_dwh',
    audit_key="customer",
    cycle_dtm="{{ ts }}",
    dag=dag,
    pool='postgres_dwh')

clear_customer = PostgresOperatorWithTemplatedParams(
    sql='TRUNCATE staging.customer CASCADE',
    postgres_conn_id='postgres_dwh',
    task_id='clear_customer',
    dag=dag,
    pool='postgres_dwh')

get_auditid >> clear_customer

if __name__ == "__main__":
    dag.cli()
I've deployed on minishift locally and imported the following dags through git https://github.com/gtoonstra/etl-with-airflow/tree/master/examples/etl-example/dags
myvalues.yaml
airflow:
  cloneDagFilesFromGit:
    enabled: true
    repository: https://github.com/gtoonstra/etl-with-airflow
    branch: master
    path: examples/etl-example/dags/
securityContext:
  enabled: false
postgresql:
  volumePermissions:
    enabled: false
  shmVolume:
    chmod:
      enabled: false
  securityContext:
    enabled: false
redis:
  securityContext:
    enabled: false
and get the same acme not found errors.
@danielc103 It looks to be working fine for us. However, if we create a new "git" directory and move all files/folders inside there, we get a similar error:
In order to fix that we had to do a couple of things:
Hope it works!
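For anyone wondering why moving everything under a "git" directory can produce this error: Python resolves a top-level import like `acme` only against the directories on `sys.path`, and Airflow puts the DAGs folder itself there, not its subfolders. The sketch below reproduces that behaviour with plain Python and a temporary directory. `can_import` and the package name `acme_demo_pkg` are made up for illustration, and the sketch only shows how the failure arises; it is not the fix applied in this thread.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def can_import(module_name, search_root):
    """Return True if module_name imports with search_root prepended to sys.path."""
    saved_path = list(sys.path)
    sys.modules.pop(module_name, None)   # forget any earlier import attempt
    importlib.invalidate_caches()
    sys.path.insert(0, str(search_root))
    try:
        importlib.import_module(module_name)
        return True
    except ModuleNotFoundError:
        return False
    finally:
        sys.path[:] = saved_path          # leave sys.path as we found it
        sys.modules.pop(module_name, None)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dags = Path(tmp) / "dags"
        package = dags / "git" / "acme_demo_pkg"   # hypothetical nested package
        package.mkdir(parents=True)
        (package / "__init__.py").write_text("")
        print("dags on sys.path:    ", can_import("acme_demo_pkg", dags))
        print("dags/git on sys.path:", can_import("acme_demo_pkg", dags / "git"))
```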
yup, that was it. thank you very much for holding my hand through that. lol. Can I open a PR for the documentation explaining the nested module?
Please do! It looks like it is already explained how to clone DAGs from a Git repo, so we should probably mention that any cloned directory should contain the `__init__.py` file when loading nested modules.
I guess it makes sense, we just need to make sure it is compatible with mounting DAGs via config maps.
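As a rough aid for checking a cloned DAG directory before deploying, the sketch below lists every directory that holds Python files but lacks the package marker file (`__init__.py`). `dirs_missing_init` is a hypothetical helper written for this thread, not something the chart or Airflow ships, and it assumes the cloned directory is available on the local filesystem.

```python
import sys
from pathlib import Path


def dirs_missing_init(root):
    """Return directories under root (root included) that contain .py files
    but no __init__.py, in sorted order."""
    root = Path(root)
    candidates = [root, *sorted(p for p in root.rglob("*") if p.is_dir())]
    missing = []
    for directory in candidates:
        has_python = any(
            f.is_file() and f.suffix == ".py" for f in directory.iterdir()
        )
        if has_python and not (directory / "__init__.py").exists():
            missing.append(directory)
    return missing


if __name__ == "__main__":
    # Usage: python check_init.py <path-to-cloned-dags-directory>
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for directory in dirs_missing_init(target):
        print("missing __init__.py:", directory)
```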
Which chart:
[airflow] version 4.4.3
Describe the bug
A DAG that requires the cx_oracle module throws an error saying the module cannot be found. I needed to manually pip install cx_oracle.
ERROR
To Reproduce
Expected behavior
cx_oracle module to be found
Additional context
This very well may be a lack of my understanding of Airflow. This python module should be used by oracle hooks from airflow as well as added modules like what I'm using.
https://airflow.apache.org/docs/stable/_modules/airflow/hooks/oracle_hook.html