Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate provider-specific executor docs to providers #34809

Closed
wants to merge 7 commits into from

Conversation

RNHTTR
Copy link
Collaborator

@RNHTTR RNHTTR commented Oct 6, 2023

Closes: #33916

@RNHTTR
Copy link
Collaborator Author

RNHTTR commented Oct 6, 2023

It's currently unclear why I'm able to reference :doc:`apache-airflow:<some document in core docs>` , but I can't seem to link, for example, :doc:`apache-airflow-providers-celery:celery_executor`

I'd appreciate if any reStructuredTexperts™️ / Sphinx experts know how to link from the apache-airflow docs to a provider doc :D

@eladkal
Copy link
Contributor

eladkal commented Oct 7, 2023

It's currently unclear why I'm able to reference :doc:`apache-airflow:<some document in core docs>`, but I can't seem to link, for example, :doc:`apache-airflow-providers-celery:celery_executor`

I'd appreciate if any reStructuredTexperts™️ / Sphinx experts know how to link from the apache-airflow docs to a provider doc :D

If not mistaken we have example in Notifer doc that links core to provider docs

@RNHTTR
Copy link
Collaborator Author

RNHTTR commented Oct 7, 2023

Yeah, I see lots of references from apache-airflow to apache-airflow-providers (for example, this one), but when I try to do :doc:`apache-airflow-providers-celery:celery_executor` , I get the following error during the build:

------------------------------ Error   5 --------------------
 WARNING: unknown document: 'CeleryExecutor apache-airflow-providers-celery:celery_executor'

File path: apache-airflow/core-concepts/executor/index.rst (70)

  65 | The DebugExecutor also exists but has been deprecated in favor of using the ``dag.test()`` command for :doc:`debugging <../debug>`.
  66 |
  67 |
  68 | **Remote Executors**
  69 |
> 70 | * :doc:`CeleryExecutor apache-airflow-providers-celery:celery_executor`
  71 | * :doc:`CeleryKubernetesExecutor apache-airflow-providers-celery:celery_kubernetes_executor`
  72 | * :doc:`DaskExecutor apache-airflow-providers-daskexecutor:dask_executor`
  73 | * :doc:`KubernetesExecutor <apache-airflow-providers-cncf-kubernetes:kubernetes_executor>`
  74 | * :doc:`KubernetesLocalExecutor <apache-airflow-providers-cncf-kubernetes:local_kubernetes_executor>`
  75 |

celery_executor.rst now lives at docs/apache-airflow-providers-celery/celery_executor.rst.

@potiuk
Copy link
Member

potiuk commented Oct 16, 2023

Yeah, I see lots of references from apache-airflow to apache-airflow-providers (for example, this one), but when I try to do :doc:`apache-airflow-providers-celery:celery_executor`, I get the following error during the build:

celery_executor.rst now lives at docs/apache-airflow-providers-celery/celery_executor.rst.

yes. Sphinx is picky and requires sometimes a lot of persistence and trying out things looking at other examples.

@RNHTTR RNHTTR marked this pull request as ready for review October 17, 2023 01:55
@RNHTTR RNHTTR requested a review from potiuk as a code owner October 17, 2023 01:55
@RNHTTR RNHTTR requested a review from eladkal October 17, 2023 02:10
@eladkal
Copy link
Contributor

eladkal commented Oct 17, 2023

Seems there are conflicts

@eladkal
Copy link
Contributor

eladkal commented Oct 17, 2023

We need also to edit docs/apache-airflow/redirects.txt

This will allow referencing from old paths to the new paths without getting 404 page

@RNHTTR RNHTTR force-pushed the move-provider-docs-out-of-core branch 2 times, most recently from 52188f2 to 5217a26 Compare October 21, 2023 14:02
@RNHTTR
Copy link
Collaborator Author

RNHTTR commented Oct 21, 2023

Is this just a flaky test? It builds fine locally

@potiuk
Copy link
Member

potiuk commented Oct 21, 2023

Is this just a flaky test? It builds fine locally

Likely one more fix to recently completed docs-build command for selective docs building - let me see.

@potiuk
Copy link
Member

potiuk commented Oct 21, 2023

#35102 should fix it

@potiuk potiuk force-pushed the move-provider-docs-out-of-core branch from 5217a26 to 414735a Compare October 21, 2023 16:24
@potiuk
Copy link
Member

potiuk commented Oct 21, 2023

Rebased. Let's see if the fix works.

@RNHTTR
Copy link
Collaborator Author

RNHTTR commented Oct 21, 2023

Thanks. What's the best way for me to retry the failed test?

@eladkal
Copy link
Contributor

eladkal commented Oct 21, 2023

Thanks. What's the best way for me to retry the failed test?

Just did. Lets see

@potiuk potiuk added the full tests needed We need to run full set of tests for this PR to merge label Oct 21, 2023
@potiuk potiuk force-pushed the move-provider-docs-out-of-core branch from 414735a to 743d2c7 Compare October 21, 2023 21:40
@potiuk
Copy link
Member

potiuk commented Oct 21, 2023

I've added full tests needed label adn rebased it -> likely the issues are because only subset of docs was built due to selective checks. With full tests needed label, all of them will run.

@RNHTTR
Copy link
Collaborator Author

RNHTTR commented Oct 22, 2023

Strange...

breeze build-docs --docs-only

This builds successfully, but if I remove --docs-only, the build fails...

@potiuk
Copy link
Member

potiuk commented Oct 22, 2023

Strange...

breeze build-docs --docs-only

This builds successfully, but if I remove --docs-only, the build fails...

You can use --clean-build flag - likely this one will reveal the problem.

@potiuk
Copy link
Member

potiuk commented Oct 22, 2023

Guessing what could happen here (and those are just guesses looking at the output)

I think the problem is that you have changes in both "airflow" and "celery" and other packages that refer to each other in both directions.

The way how docs building is done is that it builds each package separately and uses the "inventory" from other packages (downloaded from s3) it refers in order to verify if the document exist. Once the package successfully builds the docs, the downloaded inventory gets updated locally.

In most cases where there are two packages and only one of them refer to the other document, this document building has multi-pass implemented. If the package fails to be built it is added to the queue and retried again (up to three times in case the dependencies are A -> B -> C (so if A links to B and B links to C in the first pass C succeeds, in the second B will succeed as it will use locally built inventory from C and in the third A will finaly succeed using the locally built inventory from B.

The problem is that if those packages are build together and they are referring to each other's new documents - none of them can succed - because they are referring to each other's documents, and those documents do not exist in the remote inventory.

Looking at the output - it almost worked:

image

You can see that at first pass airflow + cncf.kubernetes + celery + dask failed docs building

At the second pass the three others succeeded, only airflow was left.

Unfortunately the third pass on Airflow failed.

The error you see:

image

Is that API documents could not be found in Airflow - most likely because the "clean" step deleted it and for some reason they were not recreated by the API plugin. So maybe we can attempt to remove the clean step between the retries.

Suggestion how to fix it (if my guess is right):

Only clean the docs when you start and not when you retry. Might be a good exercise to learn how the build process works.

As i understand it (again by looking at the code - I modified it quite a few times but mostly when I saw similar issues):

In build_docs.py :

The build_docs_for_packages is executed several times to make the multi-pass works

  • The first time it should be called with parameter to run cleanup
  • all the other times retry_building_docs_if_needed it should be called without running cleanup

That could probably solve the problem.

@RNHTTR RNHTTR force-pushed the move-provider-docs-out-of-core branch from 743d2c7 to d8d8e96 Compare October 23, 2023 01:31
@RNHTTR RNHTTR requested a review from kaxil as a code owner October 23, 2023 01:31
@RNHTTR
Copy link
Collaborator Author

RNHTTR commented Oct 23, 2023

That explains why a build would succeed after a handful of failure messages :) Thanks, @potiuk .

I added a param to not perform the clean step if it's on a retried build. Still getting empty spell check errors that I need to investigate.

@potiuk
Copy link
Member

potiuk commented Oct 23, 2023

Yep. Looks better. I think those are not spellcheck errors but some regular sphinx riddiles. Sphinx does not seem to like SOMETHING in the docs and not tell exactly where. The usual thing which hopefully when we move to #33156 might get better. For now it requires a bit guessing what's wrong, unfortunately.

############################## Start docs build errors summary ##############################

============================== apache-airflow-providers-celery ==============================
------------------------------ Error   1 --------------------
 WARNING: undefined label: 'executor:celeryexecutor'

File path: apache-airflow-providers-celery/_api/airflow/providers/celery/executors/celery_executor/index.rst (4)

   1 | :py:mod:`airflow.providers.celery.executors.celery_executor`
   2 | ============================================================
   3 | 
>  4 | .. py:module:: airflow.providers.celery.executors.celery_executor
   5 | 
   6 | .. autoapi-nested-parse::
   7 | 
   8 |    CeleryExecutor.
   9 | 
============================== apache-airflow-providers-cncf-kubernetes ==============================
------------------------------ Error   1 --------------------
 WARNING: undefined label: 'executor:kubernetesexecutor'

File path: apache-airflow-providers-cncf-kubernetes/_api/airflow/providers/cncf/kubernetes/executors/kubernetes_executor/index.rst (4)

   1 | :py:mod:`airflow.providers.cncf.kubernetes.executors.kubernetes_executor`
   2 | =========================================================================
   3 | 
>  4 | .. py:module:: airflow.providers.cncf.kubernetes.executors.kubernetes_executor
   5 | 
   6 | .. autoapi-nested-parse::
   7 | 
   8 |    KubernetesExecutor.
   9 | 
============================== apache-airflow-providers-daskexecutor ==============================
------------------------------ Error   1 --------------------
 WARNING: undefined label: 'executor:daskexecutor'

File path: apache-airflow-providers-daskexecutor/_api/airflow/providers/daskexecutor/executors/dask_executor/index.rst (4)

   1 | :py:mod:`airflow.providers.daskexecutor.executors.dask_executor`
   2 | ================================================================
   3 | 
>  4 | .. py:module:: airflow.providers.daskexecutor.executors.dask_executor
   5 | 
   6 | .. autoapi-nested-parse::
   7 | 
   8 |    DaskExecutor.
   9 | 

############################## End docs build errors summary ##############################

@potiuk
Copy link
Member

potiuk commented Oct 23, 2023

My wild guess is that you have label reference somewhere in the code left (of the form _executor:celeryexecutor).

@RNHTTR
Copy link
Collaborator Author

RNHTTR commented Oct 23, 2023

Locally, I get the following ☹️

The documentation has errors.
####################  Packages errors summary  ####################
Package name                                         Count of doc build errors    Count of spelling errors
-------------------------------------------------  ---------------------------  --------------------------
apache-airflow-providers-celery                                     0                           1
apache-airflow-providers-cncf-kubernetes                            0                           1
apache-airflow-providers-daskexecutor                               0                           1
##################################################

I'll keep digging.

@potiuk
Copy link
Member

potiuk commented Oct 23, 2023

Sphinxes often speak in riddles, this is something I often repeat (too often I am afraid)

@potiuk
Copy link
Member

potiuk commented Nov 15, 2023

Ho about completing it :)?

@RNHTTR
Copy link
Collaborator Author

RNHTTR commented Nov 16, 2023

Ho about completing it :)?

Been traveling / busy / sick the last few weeks. Hoping to pick this back up next week.

@RNHTTR
Copy link
Collaborator Author

RNHTTR commented Dec 9, 2023

@lzdanski This is the spell check sphinx issue I was referring to.

@potiuk
Copy link
Member

potiuk commented Dec 10, 2023

@lzdanski This is the spell check sphinx issue I was referring to.

@RNHTTR - I think first thing to do is to rebase this one. It's quite far behind, we introduced a number of changes (for example we split docs building from spellcheck) - maybe after rebase things will get clearer.

@RNHTTR RNHTTR force-pushed the move-provider-docs-out-of-core branch from d8d8e96 to cfbd26a Compare December 30, 2023 15:57
@eladkal
Copy link
Contributor

eladkal commented Jan 15, 2024

@RNHTTR can you rebase and resolve conflicts? I see the current failure is on openai provider which is odd.

@RNHTTR RNHTTR marked this pull request as draft January 23, 2024 21:33
@RNHTTR RNHTTR force-pushed the move-provider-docs-out-of-core branch from cfbd26a to 0546c48 Compare January 23, 2024 21:36
@potiuk
Copy link
Member

potiuk commented Jan 23, 2024

Fix to failing release Helm job - #36985

@potiuk
Copy link
Member

potiuk commented Jan 24, 2024

Just to comment on the errors @RNHTTR and guide you a bit..

I think all the _api errors are coming from those first few errors here. Usually they appear when the main package failed to build even after multiple attempts. They are really side-effects of the reall problems:

Error 1) in openai -> seems that there is extra file added
Error 1) Error 2) in apache-airflow -> I thin the references are just pointing to the moved docs - they should point to the new location. I think those 2 are preventing the apache-airflow package dogs from being built (and eventually trigger the API errors)

You should track the others :)

============================== apache-airflow-providers-openai ==============================
------------------------------ Error   1 --------------------
/opt/airflow/docs/apache-airflow-providers-openai/dask_executor.rst: WARNING: document isn't included in any toctree

============================== apache-airflow ==============================
------------------------------ Error   1 --------------------
 WARNING: unknown document: '../core-concepts/executor/celery'

File path: apache-airflow/administration-and-deployment/production-deployment.rst (173)

 168 |   scheduler and you cannot upgrade the Scheduler without killing the tasks run by it. You can either
 169 |   pause all your DAGs and wait for the running tasks to complete or just stop the scheduler and kill all
 170 |   the tasks it runs - then you will need to clear and restart those tasks manually after the upgrade
 171 |   is completed (or rely on ``retry`` being set for stopped tasks).
 172 | 
>173 | * For the :doc:`Celery executor <../core-concepts/executor/celery>`, you have to first put your workers in
 174 |   offline mode (usually by setting a single ``TERM`` signal to the workers), wait until the workers
 175 |   finish all the running tasks, and then upgrade the code (for example by replacing the image the workers run
 176 |   in and restart the workers). You can monitor your workers via ``flower`` monitoring tool and see the number
 177 |   of running tasks going down to zero. Once the workers are upgraded, they will be automatically put in online
 178 |   mode and start picking up new tasks. You can then upgrade the ``Scheduler`` in a rolling restart mode.
------------------------------ Error   2 --------------------
 WARNING: unknown document: '../core-concepts/executor/kubernetes'

File path: apache-airflow/administration-and-deployment/production-deployment.rst (180)

 175 |   finish all the running tasks, and then upgrade the code (for example by replacing the image the workers run
 176 |   in and restart the workers). You can monitor your workers via ``flower`` monitoring tool and see the number
 177 |   of running tasks going down to zero. Once the workers are upgraded, they will be automatically put in online
 178 |   mode and start picking up new tasks. You can then upgrade the ``Scheduler`` in a rolling restart mode.
 179 | 
>180 | * For the :doc:`Kubernetes executor <../core-concepts/executor/kubernetes>`, you can upgrade the scheduler
 181 |   triggerer, webserver in a rolling restart mode, and generally you should not worry about the workers, as they
 182 |   are managed by the Kubernetes cluster and will be automatically adopted by ``Schedulers`` when they are
 183 |   upgraded and restarted.
 184 | 
 185 | * For the :doc:``CeleryKubernetesExecutor <../core-concepts/executor/celery-kubernetes>``, you follow the
------------------------------ Error   3 --------------------
 WARNING: unknown document: 'apache-airflow-providers-daskexecutor:dask_executor'

File path: apache-airflow/core-concepts/executor/index.rst (65)

  60 | 
  61 | **Remote Executors**
  62 | 
  63 | * :doc:`CeleryExecutor <apache-airflow-providers-celery:celery_executor>`
  64 | * :doc:`CeleryKubernetesExecutor <apache-airflow-providers-celery:celery_kubernetes_executor>`
> 65 | * :doc:`DaskExecutor <apache-airflow-providers-daskexecutor:dask_executor>`
  66 | * :doc:`KubernetesExecutor <apache-airflow-providers-cncf-kubernetes:kubernetes_executor>`
  67 | * :doc:`KubernetesLocalExecutor <apache-airflow-providers-cncf-kubernetes:local_kubernetes_executor>`
  68 | 
  69 | 
  70 | .. note::
------------------------------ Error   4 --------------------
 WARNING: unknown document: 'executor/celery'

File path: apache-airflow/core-concepts/overview.rst (68)

  63 | ...................
  64 | 
  65 | Some Airflow components are optional and can enable better extensibility, scalability, and
  66 | performance in your Airflow:
  67 | 
> 68 | * Optional *worker*, which executes the tasks given to it by the scheduler. In the basic installation
  69 |   worker might be part of the scheduler not a separate component. It can be run as a long running process
  70 |   in the :doc:`CeleryExecutor <executor/celery>`, or as a POD in the
  71 |   :doc:`KubernetesExecutor <executor/kubernetes>`.
  72 | 
  73 | * Optional *triggerer*, which executes deferred tasks in an asyncio event loop. In basic installation
------------------------------ Error   5 --------------------
 WARNING: unknown document: 'executor/kubernetes'

File path: apache-airflow/core-concepts/overview.rst (68)

  63 | ...................
  64 | 
  65 | Some Airflow components are optional and can enable better extensibility, scalability, and
  66 | performance in your Airflow:
  67 | 
> 68 | * Optional *worker*, which executes the tasks given to it by the scheduler. In the basic installation
  69 |   worker might be part of the scheduler not a separate component. It can be run as a long running process
  70 |   in the :doc:`CeleryExecutor <executor/celery>`, or as a POD in the
  71 |   :doc:`KubernetesExecutor <executor/kubernetes>`.
  72 | 
  73 | * Optional *triggerer*, which executes deferred tasks in an asyncio event loop. 

@RNHTTR
Copy link
Collaborator Author

RNHTTR commented Feb 26, 2024

Can focus on this again -- will mark ready for review when tests are passing

@RNHTTR
Copy link
Collaborator Author

RNHTTR commented Feb 26, 2024

Closing this in favor of #37728

@RNHTTR RNHTTR closed this Feb 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:Executors-core LocalExecutor & SequentialExecutor area:providers full tests needed We need to run full set of tests for this PR to merge kind:documentation provider:celery provider:cncf-kubernetes Kubernetes provider related issues provider:daskexecutor provider:datadog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate Celery/Kubernetes Executor docs to providers
4 participants