index_hadoop tasks fail on wrong file format when run inside indexer #8840
Comments
OK, I think I found something:

OK, so I fixed the config problem and reverted to mainline. The problem is back.
Hi @sixtus, is this on Druid 0.16.0? Am I hearing you right that when you run 0.16.0 using MM + peons, everything works fine, but when you run 0.16.0 with an Indexer, you get this behavior?
Yes, this is Druid 0.16.0. I just fixed the config bug, reverted, and the bug is back. I'm not sure whether it happens on peons (I didn't validate that), but it definitely isn't working when running on the Indexer. My PR has two commits; I was expecting it to work after the first, and it didn't, but I think that was just a jar caching issue. I am about to test it again.
What are your runtime properties? I am looking at the code that handles HDFS path colon replacement; it hasn't changed in a while, and I don't see a reason for it to behave differently on MM/peon vs. Indexer. We also have other users on 0.16.0 + HDFS deep storage, so it should still work. But if it doesn't, please feed us some more clues and we can get to the bottom of it.
It's working for compact and index_kafka. However, index_hadoop segments are written by Hadoop itself; I think that's the problem. I just removed the broken commit from my PR.

And yes, I have been doing this for a while. I remember filing a bug about this back in the day, and now it's back.
To be more precise, it's the Hadoop reduce task that fails.
And there is no alternative to index_hadoop for us.
Hmm, I tried the Hadoop tutorial, which uses HDFS deep storage, and it worked OK for me. I wonder if something weird is going on in your setup. Do you have a stack trace from the reduce task that fails?
We upgraded to Hadoop 2.8.5 right around the same time as Druid 0.16.

The example is the stack trace in the original report.
I just tried my patch; it's not working.
I just noticed the kill task has the same problem.

I verified there is no broken name in the metadata storage, i.e. the kill task must have generated the path itself (rather than using the path from the metastore) and then runs into the same trap. From my limited understanding, it looks like it's not instantiating
It sounds like the config is specifying druid.storage.type=local somewhere; is it possible that there's a stray/stale entry for that somewhere in your runtime properties? When the Druid processes start up, they'll log their configuration properties; for the indexer process, do you see it using druid.storage.type=hdfs at that point?
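(For reference, the deep-storage settings under discussion look roughly like this in runtime.properties; the storage directory path here is illustrative, not taken from this setup:)

```properties
# Load the HDFS deep-storage extension
druid.extensions.loadList=["druid-hdfs-storage"]

# Deep storage on HDFS; a stray/stale druid.storage.type=local entry anywhere
# in the indexer's properties would produce the behavior described above
druid.storage.type=hdfs
druid.storage.storageDirectory=/druid/segments
```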
I am using my own Chef recipe (just for context: we were the first production installation outside Metamarkets, and we have been using Druid in production since 2012). Just for paranoia I used grep: nothing "local". Also, the recipe used to work on 0.13.

On the staging system, I switched to Minio (S3), and there I get a valid segment in Minio, but "local" as the type in the payload.
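(For context, the "type in the payload" refers to the segment's loadSpec in the metadata store. An abbreviated, illustrative payload showing the symptom: the segment file was pushed to Minio, yet loadSpec.type still says local:)

```json
{
  "dataSource": "foo",
  "interval": "2019-10-29T00:00:00.000Z/2019-10-30T00:00:00.000Z",
  "loadSpec": {
    "type": "local",
    "path": "/druid/segments/foo/index.zip"
  }
}
```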
--
Hagen Rother
Lead Architect | LiquidM
LiquidM Technology GmbH
Invalidenstraße 74 | 10557 Berlin | Germany
Phone: +49 176 15 00 38 77
Internet: www.liquidm.com | LinkedIn <http://www.linkedin.com/company/3488199?trk=tyah&trkInfo=tas%3AliquidM%2Cidx%3A1-2-2>
Managing Directors | Philipp Simon & Thomas Hille
Jurisdiction | Local Court Berlin-Charlottenburg HRB 152426 B
I think I found something: the staging system started working as expected when I used a middleManager rather than an indexer.

a) Any idea where to look?
b) Can I pin a certain type of work to a specific middleManager? (I have only seen pinning by data source.)
c) Can I run a second (independent) set of overlord/middleManager to split index_kafka and index_hadoop? The threading model of the middleManager is killing performance for index_kafka.

Thanks!
Hagen
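(For context on (b): the datasource-based pinning mentioned here is Druid's affinityConfig, set via the overlord's dynamic worker config endpoint, POST /druid/indexer/v1/worker. A sketch; the datasource name and hostname are illustrative:)

```json
{
  "selectStrategy": {
    "type": "fillCapacityWithAffinity",
    "affinityConfig": {
      "affinity": {
        "myDatasource": ["middlemanager1.example.com:8091"]
      }
    }
  }
}
```

This pins tasks for a given datasource to specific workers; it does not distinguish by task type, which matches the observation that pinning is "by data source" only.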
@sixtus I was able to reproduce this; it appears to be an issue where the Hadoop mapper or reducer is not picking up the configured deep-storage settings.
We have seen this before: it's not replacing `:` in segment paths, and thus HDFS refuses the file as illegal. I guess yet another thing fixed in the peon but not in the indexer?

java.lang.IllegalArgumentException: Pathname /druid/indexer/foo/2019-10-29T00:00:00.000Z_2019-10-30T00:00:00.000Z/2019-11-06T19:57:56.216Z/29/index.zip.0 from hdfs://us2/druid/indexer/foo/2019-10-29T00:00:00.000Z_2019-10-30T00:00:00.000Z/2019-11-06T19:57:56.216Z/29/index.zip.0 is not a valid DFS filename.
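(For illustration only, this is a sketch and not the actual Druid code: HDFS rejects `:` in path components, so the ISO-8601 interval timestamps embedded in segment paths must have their colons substituted before the path reaches the DFS client. The class and method names here are hypothetical:)

```java
// Sketch: mirrors the kind of substitution Druid applies for HDFS deep
// storage, where ':' in ISO-8601 interval timestamps must be replaced
// because HDFS rejects ':' in path components.
public class HdfsPathSanitizer {

    // Replace every ':' with '_' so the resulting string is a legal
    // HDFS path component.
    public static String sanitize(String pathSegment) {
        return pathSegment.replace(':', '_');
    }

    public static void main(String[] args) {
        String interval = "2019-10-29T00:00:00.000Z_2019-10-30T00:00:00.000Z";
        // Prints the interval with colons replaced by underscores
        System.out.println(sanitize(interval));
    }
}
```

The bug described above would be consistent with this substitution being applied on the MM/peon code path but skipped when the segment path is built inside a Hadoop reduce task launched from the Indexer.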