Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Git sync for plugins in the Helm Chart #11708

Open
mik-laj opened this issue Oct 21, 2020 · 19 comments
Open

Git sync for plugins in the Helm Chart #11708

mik-laj opened this issue Oct 21, 2020 · 19 comments
Labels
area:helm-chart Airflow Helm Chart good first issue kind:feature Feature Requests

Comments

@mik-laj
Copy link
Member

mik-laj commented Oct 21, 2020

Hello,

Helm Chart currently supports DAG sync, but some users have the need to sync plugins in a similar way as well.

It would be fantastic if Helm Chart would make it easy to set up sync ~/airflow/plugins from a separate repository as well.

Best regards,
Kamil Breguła

@mik-laj mik-laj added kind:feature Feature Requests good first issue area:helm-chart Airflow Helm Chart labels Oct 21, 2020
@mik-laj mik-laj changed the title Git sync for plugins Git sync for plugins in the Helm Chart Oct 21, 2020
@DevanshuKoyalkar
Copy link

Hi mik-laj, I'll take this up

@mik-laj
Copy link
Member Author

mik-laj commented Oct 27, 2020

Assigned.

We are migrating to the new unit test framework, so you will have to wait a while before you can write tests ... or if you have the patience you can also write in the old framework, but this is very limiting and frustrating.

@DevanshuKoyalkar
Copy link

I am getting a sense of the problem statement. I should be adding two new options like dags.persistence.enabled_ and dags.gitSync.enabled for plugins which do the similar job right?
Is there an equivalent of airflow_dags_mount_path for plugins?

@mik-laj
Copy link
Member Author

mik-laj commented Oct 29, 2020

I should be adding two new options like dags.persistence.enabled_ and dags.gitSync.enabled for plugins which do the similar job right?

Yes. It would be good to support the other configuration options for gitSync as well.

Is there an equivalent of airflow_dags_mount_path for plugins?

By default, $AIRFLOW_HOME/plugins.
See also:
https://airflow.readthedocs.io/en/latest/plugins.html

airflowHome: "/opt/airflow"

https://airflow.readthedocs.io/en/latest/configurations-ref.html#plugins-folder

@mik-laj
Copy link
Member Author

mik-laj commented Oct 30, 2020

We have already finished the migration to the new framework. Nothing is blocking anymore to work on this change.

@dollschasingmen
Copy link

any progress on this?

@thesuperzapper
Copy link
Contributor

The user-community chart (not the official) has implemented support for using plugins from within your DAGs git-repo. (See PR airflow-helm/charts#390)

You only need to correctly set your AIRFLOW__CORE__PLUGINS_FOLDER to be a subfolder of your dag folder, which will probably be something like /opt/airflow/dags/repo/MY_PLUGINS_FOLDER_IN_REPO.

@EliMor
Copy link

EliMor commented Aug 19, 2021

@praktiskt
Copy link

praktiskt commented Sep 23, 2021

It is possible to mount the plugins via git sync today, if you keep your plugins with your DAGs, e.g.;

# Your dag repo
dags/...
plugins/...

You can then just set an extraEnv with the direct path.

extraEnv:
    - name: "AIRFLOW__CORE__PLUGINS_FOLDER"
      value: "/opt/airflow/dags/repo/plugins"

❗ Edit: You probably don't want to do this. See the excellent explanation from potiuk below ⬇️

@potiuk
Copy link
Member

potiuk commented Sep 23, 2021

This is a good idea to have GitSync for plugins, but I am afraid the solution where you keep your plugins together with DAGs is a very bad idea for Airflow 2, one that should be heavily discouraged.

There are two reasons (actually the root cause is isolation and security):

  1. Plugins should not be "modifiable" by DAG writers. Plugins give you more "powers" over Airlfow that DAGs. The Plugins can modify UI of airflow (similarly as providers - with connections) and they are executed in the context of scheduler (not in the context of FileProcessor as all DAGs are) and you should not have the same people being able to write DAGs and modify plugins. This is something that wil be only strenghtened in the future when we increase isolation of Ailrflow components and it's a prerequisite to future Airflow multi-tenancy.

But more importantly:

  1. In Airflow 2 DAGs are not available as mounted volume for Webserver in our Helm Chart (and they should NOT be). See the Official Helm Chart. This is precisely because of the increased isolation and the fact that webserver does not need them any more (it uses Serialized form of the DAGs from the DB). So if your Plugins modify the UI of Airflow, this solution will not work (unless you also additionally mount the DAG folder to webserver - but this violates the isolation that was improved in Airflow 2 - see point 1).

Generally speaking - you SHOULD separate DAGs and Plugins. Access to modigy those two in many cases should be controlled differently for security reasons.

@tomrutter
Copy link
Contributor

Is this still on the roadmap? It seems from the above that the cleanest solution would be to replicate the current git-sync setup used for dags (in the scheduler or dag processor) for a "plugins" repo in scheduler/workers/webserver?

@potiuk
Copy link
Member

potiuk commented Oct 27, 2022

Just to set expectations and explain how it works here - @tomrutter - as everything in open-source, things get done when somene does it :). There is no "magical" management and roadmap planning that you can have single authoritative response on.

There is no roadmap unless someone decides to take on responsibility and manage, prepare and follow up the roadmap. That someone can be you if you want to get things implemented (which means often implementing things yourself and providing a PR or finding someone who would volunteer and do it.

You are absolutely free to take a lead on it :)

@potiuk
Copy link
Member

potiuk commented Oct 27, 2022

And just to add a bit - If you look at our repo - Airflow is created by >2200 contrbutors, mostly people like you who wanted something they missed and added it :). Adding such a feature is a nice way not only good for you (because you will be able to use it) but also for others (they will be able to use it too) but also a great way to contribute back for the software you got absolutely for free :)

@tomrutter
Copy link
Contributor

Thanks! I will have a closer look and see if what I've done here could be submitted as PR.

@potiuk
Copy link
Member

potiuk commented Nov 14, 2022

Would be great.

@alff
Copy link

alff commented Jan 4, 2023

hey guys, just curious: is this approach even possible? I mean: sync plugins with git-sync. I had impression that 'plugins' must be in place before airflow starts. Otherwise if you start airflow with empty folder and then try to put new modules - it returns exception 'plugin is not registered'. Correct me if i'm wrong.

@potiuk
Copy link
Member

potiuk commented Jan 4, 2023

hey guys, just curious: is this approach even possible? I mean: sync plugins with git-sync. I had impression that 'plugins' must be in place before airflow starts. Otherwise if you start airflow with empty folder and then try to put new modules - it returns exception 'plugin is not registered'. Correct me if i'm wrong.

There are various scenarios and cases when the plugins can be reloaded: https://airflow.apache.org/docs/apache-airflow/stable/plugins.html#when-are-plugins-re-loaded

@quocanh-upmesh
Copy link

any progress on this? can anyone tell me how can i sync the plugins from github? Appreciate your support. Thank

@potiuk
Copy link
Member

potiuk commented Apr 4, 2023

There are various scenarios and cases when the plugins can be reloaded: https://airflow.apache.org/docs/apache-airflow/stable/plugins.html#when-are-plugins-re-loaded

From what I see @quocanh-upmesh (Also see the docs above) nothing changed. As explained above - things get implemented here when somoene implements it (and the most certain way to get somethig implemented is to either implement it or find someone who will do it).

If you look above this would not only require change in the Helm Chart to sync the dags but also likely restarting the right components when they change (depending on the type of plugin and component). And answering your question - apparently there is no progress, otherwise semeone who would work on it would likely mention it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:helm-chart Airflow Helm Chart good first issue kind:feature Feature Requests
Projects
None yet
Development

No branches or pull requests