
FSCrawler 2.7-SNAPSHOT does not support Elasticsearch between 6.0.0 and 6.6.x #713

Closed
Gowtham029 opened this issue Apr 16, 2019 · 15 comments · Fixed by #720

@Gowtham029

When I try to run the job, I get the following error.

Config file:

name: "hello"
fs:
  url: "/home/gowtham/Documents"
  update_rate: "15m"
  excludes:
  - "*/~*"
  json_support: false
  filename_as_id: false
  add_filesize: true
  remove_deleted: true
  add_as_inner_object: false
  store_source: false
  index_content: true
  attributes_support: false
  raw_metadata: false
  xml_support: false
  index_folders: true
  lang_detect: false
  continue_on_error: false
  ocr:
    language: "eng"
    enabled: true
    pdf_strategy: "ocr_and_text"
elasticsearch:
  nodes:
  - url: "http://10.0.2.2:9200"
  bulk_size: 100
  flush_interval: "5s"
  byte_size: "10mb"
  index: "hello"

Error:

12:46:22,477 WARN  [f.p.e.c.f.c.v.ElasticsearchClientV6] failed to create index [hello], disabling crawler...
12:46:22,478 FATAL [f.p.e.c.f.c.FsCrawlerCli] Fatal error received while running the crawler: [Elasticsearch exception [type=illegal_argument_exception, reason=request [/hello] contains unrecognized parameter: [include_type_name]]]
12:46:22,478 DEBUG [f.p.e.c.f.c.FsCrawlerCli] error caught
org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=illegal_argument_exception, reason=request [/hello] contains unrecognized parameter: [include_type_name]]
    at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177) ~[elasticsearch-6.7.1.jar:6.7.1]
    at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:2053) ~[elasticsearch-rest-high-level-client-6.7.1.jar:6.7.1]
    at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:2030) ~[elasticsearch-rest-high-level-client-6.7.1.jar:6.7.1]
    at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1777) ~[elasticsearch-rest-high-level-client-6.7.1.jar:6.7.1]
    at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1734) ~[elasticsearch-rest-high-level-client-6.7.1.jar:6.7.1]
    at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1696) ~[elasticsearch-rest-high-level-client-6.7.1.jar:6.7.1]
    at org.elasticsearch.client.IndicesClient.create(IndicesClient.java:191) ~[elasticsearch-rest-high-level-client-6.7.1.jar:6.7.1]
    at fr.pilato.elasticsearch.crawler.fs.client.v6.ElasticsearchClientV6.createIndex(ElasticsearchClientV6.java:240) ~[fscrawler-elasticsearch-client-v6-2.7-SNAPSHOT.jar:?]
    at fr.pilato.elasticsearch.crawler.fs.client.v6.ElasticsearchClientV6.createIndex(ElasticsearchClientV6.java:603) ~[fscrawler-elasticsearch-client-v6-2.7-SNAPSHOT.jar:?]
    at fr.pilato.elasticsearch.crawler.fs.client.v6.ElasticsearchClientV6.createIndices(ElasticsearchClientV6.java:436) ~[fscrawler-elasticsearch-client-v6-2.7-SNAPSHOT.jar:?]
    at fr.pilato.elasticsearch.crawler.fs.FsCrawlerImpl.start(FsCrawlerImpl.java:161) ~[fscrawler-core-2.7-SNAPSHOT.jar:?]
    at fr.pilato.elasticsearch.crawler.fs.cli.FsCrawlerCli.main(FsCrawlerCli.java:270) [fscrawler-cli-2.7-SNAPSHOT.jar:?]
    Suppressed: org.elasticsearch.client.ResponseException: method [PUT], host [http://10.0.2.2:9200], URI [/hello?master_timeout=30s&include_type_name=true&timeout=30s], status line [HTTP/1.1 400 Bad Request]
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"request [/hello] contains unrecognized parameter: [include_type_name]"}],"type":"illegal_argument_exception","reason":"request [/hello] contains unrecognized parameter: [include_type_name]"},"status":400}
        at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:936) ~[elasticsearch-rest-client-6.7.1.jar:6.7.1]
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:233) ~[elasticsearch-rest-client-6.7.1.jar:6.7.1]
        at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1764) ~[elasticsearch-rest-high-level-client-6.7.1.jar:6.7.1]
        at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1734) ~[elasticsearch-rest-high-level-client-6.7.1.jar:6.7.1]
        at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1696) ~[elasticsearch-rest-high-level-client-6.7.1.jar:6.7.1]
        at org.elasticsearch.client.IndicesClient.create(IndicesClient.java:191) ~[elasticsearch-rest-high-level-client-6.7.1.jar:6.7.1]
        at fr.pilato.elasticsearch.crawler.fs.client.v6.ElasticsearchClientV6.createIndex(ElasticsearchClientV6.java:240) ~[fscrawler-elasticsearch-client-v6-2.7-SNAPSHOT.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.client.v6.ElasticsearchClientV6.createIndex(ElasticsearchClientV6.java:603) ~[fscrawler-elasticsearch-client-v6-2.7-SNAPSHOT.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.client.v6.ElasticsearchClientV6.createIndices(ElasticsearchClientV6.java:436) ~[fscrawler-elasticsearch-client-v6-2.7-SNAPSHOT.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.FsCrawlerImpl.start(FsCrawlerImpl.java:161) ~[fscrawler-core-2.7-SNAPSHOT.jar:?]
        at fr.pilato.elasticsearch.crawler.fs.cli.FsCrawlerCli.main(FsCrawlerCli.java:270) [fscrawler-cli-2.7-SNAPSHOT.jar:?]
    Caused by: org.elasticsearch.client.ResponseException: method [PUT], host [http://10.0.2.2:9200], URI [/hello?master_timeout=30s&include_type_name=true&timeout=30s], status line [HTTP/1.1 400 Bad Request]
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"request [/hello] contains unrecognized parameter: [include_type_name]"}],"type":"illegal_argument_exception","reason":"request [/hello] contains unrecognized parameter: [include_type_name]"},"status":400}
        at org.elasticsearch.client.RestClient$1.completed(RestClient.java:552) ~[elasticsearch-rest-client-6.7.1.jar:6.7.1]
        at org.elasticsearch.client.RestClient$1.completed(RestClient.java:537) ~[elasticsearch-rest-client-6.7.1.jar:6.7.1]
        at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:119) ~[httpcore-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:177) ~[httpasyncclient-4.1.2.jar:4.1.2]
        at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:436) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:326) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81) ~[httpasyncclient-4.1.2.jar:4.1.2]
        at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39) ~[httpasyncclient-4.1.2.jar:4.1.2]
        at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588) ~[httpcore-nio-4.4.5.jar:4.4.5]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_201]
12:46:22,484 DEBUG [f.p.e.c.f.FsCrawlerImpl] Closing FS crawler [hello]
12:46:22,485 DEBUG [f.p.e.c.f.c.v.ElasticsearchClientV6] Closing Elasticsearch client manager
12:46:22,486 DEBUG [f.p.e.c.f.FsCrawlerImpl] ES Client Manager stopped
12:46:22,487 INFO  [f.p.e.c.f.FsCrawlerImpl] FS crawler [hello] stopped
@dadoonet
Owner

What is your Elasticsearch version?
You might have to use 6.7.
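A quick way to confirm the version is to ask the node itself; a minimal sketch, assuming the node from the config above:

# Returns cluster info; version.number must be 6.7.0 or later for the
# FSCrawler 2.7-SNAPSHOT es6 client to create the index successfully.
curl -s 'http://10.0.2.2:9200/?pretty'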

@Gowtham029
Author

It's working perfectly with Elasticsearch 6.7. Thanks a lot @dadoonet.

@dadoonet
Owner

Great. But could you tell me which version you were on before?

@Gowtham029
Author

@dadoonet I tried with 6.4.

@dadoonet
Owner

Thanks!

@HY3798

HY3798 commented Apr 23, 2019

Can you share how you resolved it?

@dadoonet
Owner

@HY3798 By updating the Elasticsearch version to 6.7.

@HY3798

HY3798 commented Apr 23, 2019

Will that work with ES 7?

@dadoonet
Owner

Yes. It's tested against ES 7.0.

@HY3798

HY3798 commented Apr 23, 2019

Thank you, now I need to upgrade my production.

@dadoonet
Owner

> now I need to upgrade my production.

If you are still using FSCrawler 2.6, that should be OK without upgrading.
From which version to which version do you want to upgrade?

@HY3798

HY3798 commented Apr 23, 2019

I am currently on Elasticsearch (ELK stack) 6.6.2 and need fscrawler-es6-2.7-SNAPSHOT to work with it, but it's throwing the error shown below:

D:\usr\Server\ELK\6.6.2\fscrawler-es6-2.7-SNAPSHOT>bin\fscrawler --config_dir ./test myfiles
20:16:12,961 INFO  [f.p.e.c.f.c.v.ElasticsearchClientV6] Elasticsearch Client for version 6.x connected to a node running version 6.6.2
20:16:13,059 INFO  [f.p.e.c.f.FsCrawlerImpl] Starting FS crawler
20:16:13,060 INFO  [f.p.e.c.f.FsCrawlerImpl] FS crawler started in watch mode. It will run unless you stop it with CTRL+C.
20:16:13,453 WARN  [f.p.e.c.f.c.v.ElasticsearchClientV6] failed to create index [myfiles], disabling crawler...
20:16:13,455 FATAL [f.p.e.c.f.c.FsCrawlerCli] Fatal error received while running the crawler: [Elasticsearch exception [type=illegal_argument_exception, reason=request [/myfiles] contains unrecognized parameter: [include_type_name]]]
20:16:13,461 INFO  [f.p.e.c.f.FsCrawlerImpl] FS crawler [myfiles] stopped
20:16:13,464 INFO  [f.p.e.c.f.FsCrawlerImpl] FS crawler [myfiles] stopped

@dadoonet
Owner

I think you can use the one from this page: https://fscrawler.readthedocs.io/en/fscrawler-2.6/installation.html

Which links to https://repo1.maven.org/maven2/fr/pilato/elasticsearch/crawler/fscrawler-es6/2.6/fscrawler-es6-2.6.zip

This is FSCrawler 2.6 with an Elasticsearch client compatible with version 6.
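For reference, a minimal install sketch based on that link (the job name myfiles and the config_dir follow the earlier comment and are only examples):

# Download and unpack FSCrawler 2.6 (assumed to unpack into fscrawler-es6-2.6/)
wget https://repo1.maven.org/maven2/fr/pilato/elasticsearch/crawler/fscrawler-es6/2.6/fscrawler-es6-2.6.zip
unzip fscrawler-es6-2.6.zip
cd fscrawler-es6-2.6
# Run the crawler with an existing job configuration
bin/fscrawler --config_dir ./test myfiles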

@HY3798

HY3798 commented Apr 23, 2019

Thank you David. It works with FSCrawler 2.6, which saved me from upgrading ELK for now.

@dadoonet
Owner

For the record, I opened elastic/elasticsearch#41647 in the Elasticsearch project.
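For context, the failure can be reproduced without FSCrawler: the 6.7 high-level REST client adds the include_type_name parameter to the create-index request, and nodes older than 6.7 reject unrecognized parameters. A minimal sketch against a pre-6.7 node, using the index name from the original report:

# A 6.0-6.6 node returns the 400 seen in the stack trace:
# {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"request [/hello] contains unrecognized parameter: [include_type_name]"}],...,"status":400}
# A 6.7+ node accepts the same request.
curl -X PUT 'http://10.0.2.2:9200/hello?include_type_name=true'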

@dadoonet dadoonet changed the title failed to create index [hello], disabling crawler... FSCrawler 2.7-SNAPSHOT does not support Elasticsearch between 6.0.0 and 6.6.x Apr 29, 2019
dadoonet added a commit that referenced this issue May 6, 2019
Due to a change in the 6.7 high-level REST client, the FSCrawler es6 distribution no longer supports Elasticsearch versions before 6.7.
This probably won't be fixed, as per the discussion in elastic/elasticsearch#41647, so we need to document this breaking change.

Closes #713
@dadoonet dadoonet added the breaking For breaking changes label May 6, 2019
@dadoonet dadoonet added this to the 2.7 milestone May 6, 2019
@dadoonet dadoonet self-assigned this May 6, 2019