This Python script performs bulk document updates in Elasticsearch. It accomplishes the following tasks:
- Reads document data from a JSON file.
- Parallelizes document updates.
- Sends each update operation to Elasticsearch in bulk.
- This script reads document data from a JSON file and performs document updates on Elasticsearch.
- The JSON file should have the following format:
{
"organizationId1": ["requesterId1", "requesterId2", ...],
"organizationId2": ["requesterId3", "requesterId4", ...],
...
}
- After execution, it prints the start and end times along with the total processing time.
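The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names, the `organizationId` field, the use of the requester ID as the document `_id`, and the one-bulk-request-per-organization batching are all assumptions made for the example. The request shape (newline-delimited JSON posted to `_bulk` with the `application/x-ndjson` content type) follows the Elasticsearch Bulk API.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime
from typing import Callable, Dict, List


def build_bulk_body(index: str, org_id: str, requester_ids: List[str]) -> str:
    """Build an NDJSON payload for the Elasticsearch _bulk API.

    Assumption: each requester ID doubles as the document _id and the
    field to set is named "organizationId". Both are hypothetical.
    """
    lines = []
    for requester_id in requester_ids:
        # One action line, followed by the partial document to merge.
        lines.append(json.dumps({"update": {"_index": index, "_id": requester_id}}))
        lines.append(json.dumps({"doc": {"organizationId": org_id}}))
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


def post_bulk(base_url: str, body: str):
    """Send one payload to the _bulk endpoint (needs the requests library)."""
    import requests  # imported lazily so the helpers above work without it

    return requests.post(
        f"{base_url}/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        timeout=30,
    )


def run_updates(
    mapping: Dict[str, List[str]],
    index: str,
    send: Callable[[str], object],
    workers: int = 4,
) -> list:
    """Send one bulk request per organization in parallel and time the run."""
    started_at = datetime.now()
    t0 = time.perf_counter()
    # Skip organizations without requesters so no empty body is sent.
    bodies = [build_bulk_body(index, org, ids) for org, ids in mapping.items() if ids]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(send, bodies))
    elapsed = time.perf_counter() - t0
    print(f"start: {started_at}  end: {datetime.now()}  elapsed: {elapsed:.2f}s")
    return results
```

Passing `send` as a parameter keeps the parallel part testable without a running cluster. Against a real instance it could be something like `lambda body: post_bulk("http://localhost:9200", body)`, where the URL is a placeholder.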
- Python 3.6 or higher
- Elasticsearch 7.6.2 or higher
- requests library for Python
- result.json file containing the document data to be updated. (This file should be in the same directory as the script)
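As an illustration of how the input file might be read and checked against the expected format, here is a small sketch. The `load_mapping` helper is hypothetical and only verifies the shape described above (an object whose values are lists); it is not taken from the script itself.

```python
import json
from pathlib import Path
from typing import Dict, List


def load_mapping(path: Path) -> Dict[str, List[str]]:
    """Load result.json and check that it maps organization IDs to lists."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object keyed by organization ID")
    for org_id, requester_ids in data.items():
        if not isinstance(requester_ids, list):
            raise ValueError(f"value for {org_id!r} must be a list of requester IDs")
    return data


# Inside the script, the file next to it could be located with:
#   load_mapping(Path(__file__).with_name("result.json"))
```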
This query can be used to generate the result.json file from the database.
SELECT json_agg(json_build_array(organization_id, user_ids))
FROM (SELECT au.organization_id, jsonb_agg(DISTINCT au.id) AS user_ids
FROM ticket t
LEFT JOIN app_user au ON t.requester_id = au.id
WHERE au.organization_id IS NOT NULL
GROUP BY au.organization_id
) subquery;
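Note that `json_agg(json_build_array(...))` returns an array of `[organization_id, user_ids]` pairs rather than an object keyed by organization ID. If the script expects the object form, the query output could be reshaped with a few lines of Python. The `pairs_to_mapping` helper below is a hypothetical sketch, and whether this step is needed depends on how the script parses the file.

```python
import json
from typing import Dict, List


def pairs_to_mapping(pairs: list) -> Dict[str, list]:
    """Turn [[org_id, [user_id, ...]], ...] into {org_id: [user_id, ...]}."""
    # JSON object keys are always strings, so the organization ID is cast explicitly.
    return {str(org_id): list(user_ids) for org_id, user_ids in pairs}


if __name__ == "__main__":
    # Example input in the shape produced by the query above (values are made up).
    query_output = [[1, [10, 11]], [2, [12]]]
    print(json.dumps(pairs_to_mapping(query_output), indent=2))
```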
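- Install the requests and argparse libraries for Python (argparse has been part of the standard library since Python 3.2, so only requests normally needs installing), or run the following command:
pip install -r requirements.txt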
- Run the script with the following command:
python script.py --tenantId develop --env local
- tenantId: The tenantId to be updated.
- env: The environment to be updated.
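For reference, the two command-line options could be parsed with `argparse` along these lines. This is a sketch under assumptions: the help texts, the `required=True` settings, and the `parse_args` wrapper are illustrative and may differ from the actual script.

```python
import argparse


def parse_args(argv=None):
    """Parse the --tenantId and --env options shown in the command above."""
    parser = argparse.ArgumentParser(
        description="Bulk-update Elasticsearch documents."
    )
    parser.add_argument("--tenantId", required=True, help="tenantId to be updated")
    parser.add_argument("--env", required=True, help="environment to be updated, e.g. local")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"tenantId={args.tenantId} env={args.env}")
```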