Yet another NPE #35
Comments
JFYI, I wiped it out and started with a fresh backup. Feel free to close this issue. |
Hmm… This is weird… The relevant lines of code are in kafka-backup/src/main/java/de/azapps/kafkabackup/sink/BackupSinkTask.java, lines 121 to 122 in f30b9ad.
That code is called for every TopicPartition to open… It does not make sense to me why this happens :(
Do you still have your data available for further debugging? It would be interesting to try to reproduce this. We should at least show a meaningful error instead of just throwing an NPE… |
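As a rough illustration of the "meaningful error instead of an NPE" idea, here is a minimal sketch. It is not the actual BackupSinkTask code: the class PartitionLookupSketch, the map offsetsByPartition, and the method lastOffset are hypothetical names invented for this example, and the real state kept per TopicPartition is assumed, not known.

```java
import java.util.HashMap;
import java.util.Map;

public class PartitionLookupSketch {
    // Hypothetical stand-in for whatever per-partition state the sink task keeps.
    static final Map<String, Long> offsetsByPartition = new HashMap<>();

    // Looks up the state for a partition and fails with a descriptive message
    // instead of letting a null value surface later as a bare NullPointerException.
    static long lastOffset(String topicPartition) {
        Long offset = offsetsByPartition.get(topicPartition);
        if (offset == null) {
            throw new IllegalStateException(
                "No backup state found for partition '" + topicPartition
                + "'. The target directory may be incomplete or corrupted.");
        }
        return offset;
    }

    public static void main(String[] args) {
        offsetsByPartition.put("my-topic-0", 42L);
        System.out.println(lastOffset("my-topic-0"));
        try {
            lastOffset("my-topic-1"); // not registered, so this throws
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point is only that the failing lookup names the partition and hints at a likely cause, which would make a broken target directory much easier to spot in the logs.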
Hi! Yes, I still have the old backup directory. Though using it may affect the current cluster backup state, I guess, as I didn't change the sink name.. |
My guess is that this target dir was broken in some way during the attempts to enable eCryptfs. Maybe some file was changed accidentally or something like that. |
Hmm… Does it contain sensitive information? If not feel free to upload it to https://send.firefox.com/ and send me a link. I would try to reproduce the error. |
Happened today on another cluster too... UPD. I checked the logs just now. The NPE actually happened earlier. Kafka-backup was killed by OOM multiple times... It seems Do you want me to do some debugging on this data? I cannot upload it as it contains the company's sensitive data... |
BTW, can a failed sink cause kafka-connect to exit? It'd be nice if the whole standalone connect process failed when a single sink fails (when no other sink/connector exists). |
Sorry, I missed the update. Feel free to just add another comment ;) I have no idea how to calculate that right now. I have opened a new ticket #47 for that discussion
Yes please! That would be awesome! |
I'm not skilled in Java debugging, unfortunately... though I can run something if you guide me on this. |
Ok, I will try to think about how to do that during the next few days. Maybe I'll find the issue myself by chance 😂 (I want to write more tests) |
According to what I saw here, I'd suggest you to kill kafka-backup process by |
I have seen it today in my test setup too… Currently I am failing to reproduce it. I will try again during the next few days… |
Hit this again after a few hours of #88 and one OOM.. |
Saw this tonight on kafka-backup service shutdown before doing Azure blobstore backup:
|
🤔 Maybe I should try to reproduce it by letting Kafka Backup run for a few hours while producing a lot of data… I need to think about how to debug that in the most meaningful manner… I think it would help a bit if you could monitor your Kafka Backup setup. Maybe we will see something useful in the metrics 🤔 |
I reproduced the same error:
I am using a pod setup on k8s with an Azure File Share filesystem to store the backup. I will try to add some logs at this point. |
Just hit the NPE below yesterday (using commit 3c95089). Tried today with the latest commit from master (f30b9ad), though it's still there. The output below is from the latest version.
What's changed: I migrated to eCryptfs. I stopped kafka-backup, renamed the target dir, then emptied and chattr +i'd the backup sink config (to prevent kafka-backup from being started by Puppet again). Then I deployed the eCryptfs changes, did an rsync back, then un-chattr +i'd it and reapplied Puppet.
Now the main question: should we try to debug this at all? Or should I just wipe it and do another fresh backup? This is QA, so we have some time in case.