Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark GeoIpDownloaderTask as completed after cancellation #84028

Merged
merged 107 commits into from Feb 28, 2022

Conversation

probakowski
Copy link
Contributor

Persistent task framework requires tasks to be marked either completed or failed after cancellation. This was missing in GeoIpDownloader

@probakowski probakowski added >bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v7.17.1 v8.0.1 v8.2.0 v8.1.1 labels Feb 16, 2022
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Feb 16, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine
Copy link
Collaborator

Hi @probakowski, I've created a changelog YAML for you.

@probakowski probakowski added the auto-backport-and-merge Automatically create backport pull requests and merge when ready label Feb 16, 2022
Copy link
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Premzko, this is indeed the bugfix I had in mind. I think we could also add an integration that isolate a node until the task is assigned somewhere else, and then check it is correctly cancelled once the node joins back the cluster. WDYT?

@arteam arteam self-requested a review February 16, 2022 13:50
Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Production code change LGTM

I agree with @tlrx that we do need an integration test,
that verifies that the node tasks get cleaned up after
a new node task is started by persistent tasks framework.
(I think temporarily isolating the node that runs geoip
downloader task is a good idea to trigger this bug)

@probakowski
Copy link
Contributor Author

That makes sense, I'll add test case

stu-elastic and others added 9 commits February 23, 2022 22:41
Adds the fields API for `dense_vector` field mapper.

Adds a `DenseVector` interface for the value type.

Implemented by:
 * `KnnDenseVector` which wraps a decoded float array from `VectorValues`
 * `BinaryDenseVector` which lazily decodes a `BytesRef` from `BinaryDocValues`

The vector operations have moved into those implements from `BinaryDenseVectorScriptDocValues.java` and  `KnnDenseVectorScriptDocValues.java`, respectively.

The `DenseVector` API is:
```
float getMagnitude();
double dotProduct(float[] | List);
double l1Norm(float[] | List);
double l2Norm(float[] | List);
float[] getVector();
int dims();

boolean isEmpty(); // does the value exist
int size();        // 0 if isEmpty(), 1 otherwise
Iterator<Float> iterator()
```

`dotProduct`, `l1Norm` and `l2Norm` take a `float[]` or a `List` via the
a delegating `default` method on the `DenseVector` interface.

The `DenseVectorDocValuesField` abstract class contains two getter APIS.
It is implemented by  `KnnDenseVectorDocValuesField` and
`BinaryDenseVectorDocValuesField`.

```
DenseVector get()
DenseVector get(DenseVector defaultValue)
```

The `get()` method is included because there isn't a good default dense vector,
so that API returns an empty `DenseVector` which throws an
`IllegalArgumentException` for all method calls other than `isEmpty()`,
`size()` and `iterator()`.

The empty dense vector will always be `DenseVector.EMPTY` in case users want
to use equality checks.

Refs: elastic#79105
If rolling upgrade was used from version prior GeoIPv2 (<`7.14`) then
geoip downloader wouldn't be started so no new databases were
downloaded. This is especially troubling in `8.x` as we no longer
provide default databases inside ES so after upgrade no geoip enrichment
can take place until downloader is started with workaround (setting
`ingest.geoip.downloader.enabled` to `false` and `true` again). This is
because logic that was used to lower number of requests / cluster update
listeners at the startup was too optimistic about order of actions / who
can be elected master at what time.  This change fixes that and also
cleans up logs when there are some ignorable errors and adds debug
logging on start and stop of the task to ease up troubleshooting. It
also adds rolling upgrade test to make sure the fix works.
bug Introduced by elastic#83835

This switches back our token tagging to take into account the tokens position when reconstituting and tagging tokens for NER.
This PR unmutes some tests that were muted for backporting elastic#83290
@dakrone dakrone self-assigned this Feb 24, 2022
Copy link
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@probakowski probakowski merged commit 0116cce into elastic:master Feb 28, 2022
@probakowski probakowski deleted the complete-geoip branch February 28, 2022 09:30
dakrone pushed a commit to dakrone/elasticsearch that referenced this pull request Feb 28, 2022
Persistent task framework requires tasks to be marked either completed or failed after cancellation. This was missing in GeoIpDownloader
nik9000 pushed a commit to nik9000/elasticsearch that referenced this pull request Feb 28, 2022
Persistent task framework requires tasks to be marked either completed or failed after cancellation. This was missing in GeoIpDownloader
@DaveCTurner
Copy link
Contributor

Looks like this is backport-pending (I've added the label) but note that it adds a flaky test which is currently muted in master - see #84617.

tlrx pushed a commit to tlrx/elasticsearch that referenced this pull request Mar 3, 2022
Persistent task framework requires tasks to be marked either completed or failed after cancellation. This was missing in GeoIpDownloader
gmarouli pushed a commit to gmarouli/elasticsearch that referenced this pull request Mar 16, 2022
Persistent task framework requires tasks to be marked either completed or failed after cancellation. This was missing in GeoIpDownloader
@gmarouli gmarouli removed backport pending auto-backport-and-merge Automatically create backport pull requests and merge when ready v8.1.1 v8.0.2 v7.17.2 labels Mar 16, 2022
@gmarouli
Copy link
Contributor

Backport will be handled by #85014

gmarouli added a commit that referenced this pull request Mar 16, 2022
Persistent task framework requires tasks to be marked either completed or failed after cancellation. This was missing in GeoIpDownloader.
gmarouli added a commit to gmarouli/elasticsearch that referenced this pull request Mar 16, 2022
elastic#85014)

Persistent task framework requires tasks to be marked either completed or failed after cancellation. This was missing in GeoIpDownloader.
gmarouli added a commit to gmarouli/elasticsearch that referenced this pull request Mar 16, 2022
elastic#85014)

Persistent task framework requires tasks to be marked either completed or failed after cancellation. This was missing in GeoIpDownloader.
elasticsearchmachine pushed a commit that referenced this pull request Mar 16, 2022
…) (#85023)

Persistent task framework requires tasks to be marked either completed or failed after cancellation. This was missing in GeoIpDownloader.
elasticsearchmachine pushed a commit that referenced this pull request Mar 16, 2022
…) (#85022)

Persistent task framework requires tasks to be marked either completed or failed after cancellation. This was missing in GeoIpDownloader.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP Team:Data Management Meta label for data/management team v8.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet