Update XslProcessApi to process the draft copies for the metadata in a workflow #5811

Merged
merged 2 commits on Nov 9, 2021

Conversation

josegar74
Member

When the workflow is enabled the batch processing works fine in these cases:

  • records are in draft
  • records approved without working copy

But it does not work correctly if the metadata is approved and has a working copy. In this case it processes all the records but updates only the approved version, not the draft version.

This PR updates the batch processing API to handle both versions of the metadata. For now, the total number of records counts both the approved and draft versions. This is something to improve in the report, but it will require some refactoring of the current report.
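To illustrate the behaviour described above, here is a minimal conceptual sketch in Python. It is not the GeoNetwork implementation (which is Java); the `Record` class, the `expand_with_drafts` function, and the `drafts_by_uuid` lookup are hypothetical names used only to show why an approved record with a working copy contributes two entries to the total.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a metadata record; not a GeoNetwork class."""
    id: int
    uuid: str
    is_draft: bool = False


def expand_with_drafts(selected: Iterable[Record],
                       drafts_by_uuid: Dict[str, Record]) -> List[Record]:
    """Return each selected record followed by its working copy, if one exists."""
    to_process: List[Record] = []
    for record in selected:
        to_process.append(record)
        draft = drafts_by_uuid.get(record.uuid)
        # Only an approved record can have a separate working copy to add.
        if draft is not None and not record.is_draft:
            to_process.append(draft)
    return to_process


# Three selected records; only the first has a working copy (assumed ids).
selected = [Record(1, "a"), Record(2, "b"), Record(3, "c")]
drafts = {"a": Record(4, "a", is_draft=True)}

records = expand_with_drafts(selected, drafts)
print(len(records))  # the count includes the draft as a separate entry
```

Under these assumptions the processed list holds one more entry than the selection, which matches the note that approved and draft versions are both counted in the total.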

@jodygarnett
Contributor

When testing this branch using a fresh checkout of main with ES 7.6.2, an error is displayed:

⚠️ Query returned an error. Check the console for details.

While the GeoNetwork logs are empty, the browser console shows some errors.

@jodygarnett
Contributor

Loading sample data indicates success in the UI, but the resulting editor board is empty.

Checking logs shows:

2021-07-13 17:12:52,626 ERROR [geonetwork.index] - Document with error #5a3bfe85-e854-46b5-8146-6519d1a4ac59: ElasticsearchException[Elasticsearch exception [type=cluster_block_exception, reason=index [gn-records] blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];]].

Going to try to recreate the ES environment and try again.

@jodygarnett
Contributor

Notes:

  1. Setting up a new es using cd es; docker compose up
  2. Application still does not work, same web console errors as previously
  3. use Admin console > metadata & templates to load samples for iso19139
  4. Reviewing geonetwork logs for loading the sample data:
2021-07-13 17:20:42,760 ERROR [geonetwork.wro4j] - Error occurred during a wro4j request handling
ro.isdc.wro.WroRuntimeException: Cannot build valid CacheKey from request: /geonetwork/static/bootstrap-tagsinput.min.js.map
...
2021-07-13 17:20:42,769 ERROR [ro.isdc.wro.http.WroFilter] - Exception occured
ro.isdc.wro.WroRuntimeException: Cannot build valid CacheKey from request: /geonetwork/static/bootstrap-tagsinput.min.js.map
...
2021-07-13 17:20:56,017 ERROR [geonetwork.index] - Document with error #78f93047-74f8-4419-ac3d-fc62e4b0477b: ElasticsearchException[Elasticsearch exception [type=cluster_block_exception, reason=index [gn-records] blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];]].
...
2021-07-13 17:20:56,018 ERROR [geonetwork.index] - {"docType":"metadata","document":""...

Contributor

@jodygarnett jodygarnett left a comment

@josegar74 this PR is preventing loading sample data, so I am unable to test the ability to bulk process with a draft metadata record in the mix.

I also tested the main branch on its own. While the UI errors still occur, it can add sample records.

@josegar74
Member Author

@jodygarnett I tested loading the samples for iso19139 with the latest main branch, with and without this pull request and with and without the workflow enabled, and I got no issues.

Afaik, the code changes are not involved in the sample loading.

Steps done, using a ES docker image:

  1. Checkout / execute the code
git clone https://github.com/geonetwork/core-geonetwork 
cd core-geonetwork
git submodule update --init
mvn -T 2C clean install -DskipTests 
cd web 
mvn jetty:run
  2. Log in as administrator and load iso19139 samples

  3. Check the search page and the jetty console for no errors

@jodygarnett
Contributor

Some of the trouble is the instructions for running ES are not yet merged (so I do not know exactly what you are running). I also note the mvn clean:clean@reset target is not ported to main yet (so no way to reset state).

Starting from a fresh checkout of main:

git clone https://github.com/geonetwork/core-geonetwork core-geonetwork5
cd core-geonetwork5
git submodule update --init
mvn -T 2C clean install -DskipTests 
cd es
docker-compose up
cd web
mvn jetty:run

Notes

  • Updated to latest Docker Desktop 3.5.2, Compose 1.29.2, Engine 20.10.7
  • Firefox 89.0.2

Result:

image

@jodygarnett
Contributor

Ignoring the UI error and trying a few things:

  • upgrade docker-compose.yml to use 7.10.2
  • load the iso19139 sample data

The user interface indicates the sample data loaded, but the result is not shown in the editor console.

The editor console initially takes 2 mins to load.

The logs show some failures when loading sample data:

2021-07-14 10:51:04,483 ERROR [geonetwork.index] - Document with error #78f93047-74f8-4419-ac3d-fc62e4b0477b: ElasticsearchException[Elasticsearch exception [type=cluster_block_exception, reason=index [gn-records] blocked by: [TOO_MANY_REQUESTS/12/dis

@ianwallen
Contributor

@jodygarnett
I sometimes have odd issues with the index (similar to what you are seeing). I will generally go to admin console -> tools -> delete index and reindex and it will generally solve my issues.
Now if only I can figure out what causes the index to get messed up in the first place....

@jodygarnett
Contributor

Notes:

  • I checked out docker-compose.yaml and removed the contents of es/es-dashboards/data, so now Elasticsearch starts up without any errors in its logs. Action: I think es needs a mvn clean:clean@reset goal.
  • The main application still has the "Query returned an error. Check the console for details." popup on every page load.
  • Using Admin > Tools to delete index and reindex did not have any effect.
  • Loading the 5 iso19139 sample records results in "TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block" messages in the Elasticsearch / Kibana log.
  • The catalog remains empty.

@ianwallen
Contributor

Regarding the "read-only-allow-delete block", I have been getting that on my localhost and it has been related to the disk space. By default if you have less than 5% free disk space ES will block writes. In my case I have a large disk so even when I have 100GB remaining on my disk, I'm still below the 5% threshold.

You can read more on it at the following url.

https://stackoverflow.com/questions/50609417/elasticsearch-error-cluster-block-exception-forbidden-12-index-read-only-all
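As a rough illustration of the disk-space explanation above, the following Python sketch checks whether a disk would cross the default 95% flood-stage watermark and builds (without sending) a request that clears the read-only-allow-delete block through the Elasticsearch index settings API. The base URL `http://localhost:9200`, the helper names, and the example disk sizes are assumptions for illustration; the index name `gn-records` comes from the logs in this thread. Clearing the block only helps once disk space has actually been freed or the watermark settings have been adjusted.

```python
import json
import urllib.request

# Elasticsearch default for cluster.routing.allocation.disk.watermark.flood_stage.
FLOOD_STAGE_PERCENT = 95.0


def used_percent(total_bytes: int, free_bytes: int) -> float:
    """Percentage of the disk that is in use."""
    return 100.0 * (total_bytes - free_bytes) / total_bytes


def exceeds_flood_stage(total_bytes: int, free_bytes: int,
                        threshold: float = FLOOD_STAGE_PERCENT) -> bool:
    """True when disk usage is at or above the flood-stage threshold."""
    return used_percent(total_bytes, free_bytes) >= threshold


def build_unblock_request(base_url: str, index: str) -> urllib.request.Request:
    """Build a PUT request that resets index.blocks.read_only_allow_delete."""
    body = json.dumps({"index.blocks.read_only_allow_delete": None}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed example: a 4 TB disk with 100 GB free is still above the threshold.
print(exceeds_flood_stage(4_000_000_000_000, 100_000_000_000))

# Assumed local Elasticsearch URL; pass the request to urllib.request.urlopen to send it.
request = build_unblock_request("http://localhost:9200", "gn-records")
print(request.get_method(), request.full_url)
```

The request is only constructed here so the sketch can be run without a live cluster.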

Contributor

@juanluisrp juanluisrp left a comment

Tested with this procedure:

  • Create 5 new records (A, B, C, D and E)
  • Enable the workflow in A record
  • Publish record A.
  • Create a draft copy of A, A'.
  • Add records A, B, and C to a selection bucket.
  • Execute a batch process using the API passing the name of the bucket.
  • In the response, numberOfRecords and numberOfRecordsProcessed are set to 4. In the metadata field 4 ids are returned, including the id of the draft A'.
{
  "errors": [],
  "infos": [],
  "uuid": "858f74c8-25bb-43e8-8aa7-acaf04e6eafd",
  "metadata": [
    128,
    123,
    124,
    125
  ],
  "metadataErrors": {},
  "metadataInfos": {},
  "processId": "url-host-relocator",
  "noProcessFoundCount": 0,
  "numberOfRecordNotFound": 0,
  "numberOfRecordsNotEditable": 0,
  "numberOfRecordsWithErrors": 0,
  "numberOfRecordsProcessed": 4,
  "numberOfRecords": 4,
  "numberOfNullRecords": 0,
  "startIsoDateTime": "2021-11-09T11:17:34.765Z",
  "endIsoDateTime": "2021-11-09T11:17:37.813Z",
  "ellapsedTimeInSeconds": 3,
  "totalTimeInSeconds": 3,
  "running": false,
  "type": "XsltMetadataProcessingReport"
}
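A client could sanity-check a report like the one above with a few lines of Python. This is only a sketch: the `summarize` helper and the keys of its result are hypothetical, the embedded JSON is a subset of the fields shown in the report, and the checks are an informal reading of those fields rather than a documented contract of the API.

```python
import json

# Subset of the report shown above; the remaining fields are omitted for brevity.
REPORT_JSON = """
{
  "errors": [],
  "metadata": [128, 123, 124, 125],
  "processId": "url-host-relocator",
  "numberOfRecordsWithErrors": 0,
  "numberOfRecordsProcessed": 4,
  "numberOfRecords": 4,
  "running": false
}
"""


def summarize(report: dict) -> dict:
    """Derive a few yes/no facts from a processing report (hypothetical helper)."""
    return {
        "finished": not report["running"],
        "ids_listed": len(report["metadata"]),
        "all_processed": report["numberOfRecordsProcessed"] == report["numberOfRecords"],
        "clean": not report["errors"] and report["numberOfRecordsWithErrors"] == 0,
    }


summary = summarize(json.loads(REPORT_JSON))
print(summary)
```

With three records in the selection, a result listing four ids is consistent with the draft copy being processed as its own entry.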

@josegar74 josegar74 merged commit f7172e2 into geonetwork:main Nov 9, 2021
josegar74 added a commit that referenced this pull request Nov 9, 2021
…a in a workflow (#5811)

* Update XslProcessApi to process the draft copies for the metadata in a workflow

* mend
@juanluisrp juanluisrp mentioned this pull request Apr 25, 2022