Skip to content

elastic-search index creation  #29280

@riyaj8888

Description

@riyaj8888

Apache Airflow Provider(s)

elasticsearch

Versions of Apache Airflow Providers

apache-airflow-providers-elasticsearch == 4.3.3

Apache Airflow version

2.3.2

Operating System

windows

Deployment

Docker-Compose

Deployment details

No response

What happened

in docker-compose i have added elastic-search as service and using that service i am trying to connect to elastic cluster and create index in DAG file but i am getting following errors
new connection refused

`from airflow.providers.elasticsearch.hooks.elasticsearch import ElasticsearchHook

from airflow.operators.dummy_operator import DummyOperator
from datetime import datetime,timedelta
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from elasticsearch import Elasticsearch

default_args = {"owner":"default",
"depend_on_past":False,
"start_date":datetime(2023, 1 ,26),
"retries":1,
"retry_delay":timedelta(minutes=15)}

mappings = {

"properties": {
    "title": {"type": "text"},
    "description": {"type": "text"},
    "sent-emb": {
        "type": "dense_vector",
        "dims": 768,
        "index": True,
        "similarity": "l2_norm"
    }

}
}

def get_activated_sources():
dense_vector_dim = 768
index_name = 'test_es'
es = Elasticsearch("http://localhost:9200")
# es.indices.create(index=index_name, body=mappings, ignore=400)
# es_index(es,dense_vector_dim,index_name)
# es_hook = ElasticsearchHook(hosts= ['http://localhost:9200'])
# connection = es_hook.get_conn()
print(es)

with DAG ("es_dag",default_args=default_args,schedule_interval="@daily",catchup=False) as dag:

start_task = DummyOperator(task_id="dummpy_task")

hook_task = PythonOperator(task_id="hook_task",python_callable=get_activated_sources)

start_task >> hook_task

`

I am able to connect to elastic-search cluster ,but when i am trying run
es.indices.create(index=index_name, body=mappings, ignore=400) this throwing error.

can we create elastic index inside airflow DAG or not?

thanks

What you think should happen instead

same script is able to create elastic index on local machine , but Inside DAG script its throwing error.

How to reproduce

in docker-compose i have added elastic-search as service and using that service i am trying to connect to elastic cluster and create index in DAG file but i am getting following errors
new connection refused

`from airflow.providers.elasticsearch.hooks.elasticsearch import ElasticsearchHook

from airflow.operators.dummy_operator import DummyOperator
from datetime import datetime,timedelta
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from elasticsearch import Elasticsearch

default_args = {"owner":"default",
"depend_on_past":False,
"start_date":datetime(2023, 1 ,26),
"retries":1,
"retry_delay":timedelta(minutes=15)}

mappings = {

"properties": {
    "title": {"type": "text"},
    "description": {"type": "text"},
    "sent-emb": {
        "type": "dense_vector",
        "dims": 768,
        "index": True,
        "similarity": "l2_norm"
    }

}
}

def get_activated_sources():
dense_vector_dim = 768
index_name = 'test_es'
es = Elasticsearch("http://localhost:9200")
# es.indices.create(index=index_name, body=mappings, ignore=400)
# es_index(es,dense_vector_dim,index_name)
# es_hook = ElasticsearchHook(hosts= ['http://localhost:9200'])
# connection = es_hook.get_conn()
print(es)

with DAG ("es_dag",default_args=default_args,schedule_interval="@daily",catchup=False) as dag:

start_task = DummyOperator(task_id="dummpy_task")

hook_task = PythonOperator(task_id="hook_task",python_callable=get_activated_sources)

start_task >> hook_task

`

I am able to connect to elastic-search cluster ,but when i am trying run
es.indices.create(index=index_name, body=mappings, ignore=400) this throwing error.

can we create elastic index inside airflow DAG or not?

thanks

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions