Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid metadata EA errors #236

Closed
Adubzzy opened this issue Mar 2, 2023 · 31 comments
Closed

Invalid metadata EA errors #236

Adubzzy opened this issue Mar 2, 2023 · 31 comments
Assignees

Comments

@Adubzzy
Copy link

Adubzzy commented Mar 2, 2023

I am running "12.3-RELEASE-p11 FreeBSD 12.3-RELEASE-p11 GENERIC amd64" and have just updated to the newest netatalk port using postmaster.

netatalk3-3.1.14,1 File server for Mac OS X

I read several reports on sourceforge documenting the issue I am seeing below:

https://sourceforge.net/p/netatalk/bugs/670/?limit=25

same recurring error in the log and that this would be addressed in the newest release. I do not think it was addressed as my logs are still being spammed with the message. I even rebooted the machine to make sure all libraries and apps start up organically.

My afpd logs show this error every few seconds while I have a share open:

Mar 02 16:29:24.771487 afpd[1134] {ad_open.c:818} (error:ad): ad_header_read_ea("/usr/home/alex"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 02 16:29:24.771494 afpd[1134] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument

@rdmark
Copy link
Member

rdmark commented Mar 2, 2023

@Adubzzy
Copy link
Author

Adubzzy commented Mar 3, 2023

I don't have those tools installed besides that is a bit out of my depth too. The odd thing is that from what I do know, even though the logs are saying a core file is being created the process is not dying nor is there performance issues so even if I did what you said no core file would be created if the process didn't die correct? If I didn't look in the log I would not have noticed anything. Netatalk is working fine, only thing is that I noticed that since upgrading the log files were getting huge since they were constantly being spammed by that message .

Edit: I know it's not the proper thing to do but I just turned logging on to severe and the logs stopped being spammed, but I imagine the underlying issue is still happening but I am not noticing anything out of the ordinary in terms of netatalk.

@rdmark
Copy link
Member

rdmark commented Mar 3, 2023

Aha, I see. Yes this is unusual, since the invalid metadata EA error led to a segfault crash in the previous netatalk3 release version, IINM. So either the error message here is bogus, or an assertion is missing in the code.

Can you please upload a more complete log, just so that we can see if there are other leads earlier in the chain of messages?

@Adubzzy
Copy link
Author

Adubzzy commented Mar 3, 2023

Some more background... I have 4 shares, 1 is a home dir, 1 is a regular dir and 2 are Time Machine dirs. The only ones that cause this error are the home dir and the regular dir. I thought is had something to do with the "." files so like .DS_Store so I deleted them and recreated them but it still causes an issue. In the logs below you can see I mounted /storage/stuff which causes the errors and then I mounted my Tmemachine dir and there was no error.

Mar 03 12:23:54.050819 netatalk[836] {netatalk.c:447} (note:Default): Netatalk AFP server starting
Mar 03 12:23:54.066423 netatalk[836] {afp_avahi.c:314} (info:AFPDaemon): Successfully started avahi loop.
Mar 03 12:23:54.066453 netatalk[836] {netatalk.c:517} (note:Default): Registered with Zeroconf
Mar 03 12:23:54.164178 cnid_metad[860] {cnid_metad.c:510} (note:AFPDaemon): CNID Server listening on localhost:4700
Mar 03 12:23:54.195635 netatalk[836] {afp_avahi.c:80} (info:AFPDaemon): Registering volume 'TimeMachineLB' with UUID: '19848292-5CCE-BDD8-FC49-86770F1BE54E' for TimeMachine
Mar 03 12:23:54.195652 netatalk[836] {afp_avahi.c:80} (info:AFPDaemon): Registering volume 'TimeMachine2' with UUID: 'F6C1EE2B-CBE7-26A3-DB7B-9E794C200984' for TimeMachine
Mar 03 12:23:54.195658 netatalk[836] {afp_avahi.c:94} (info:AFPDaemon): hostname: meatwad
Mar 03 12:23:54.195661 netatalk[836] {afp_avahi.c:118} (info:AFPDaemon): Registering server 'meatwad' with Bonjour
Mar 03 12:23:56.026925 afpd[859] {status.c:644} (info:AFPDaemon): signature is 705089E8ECF2836852C659F4AFCD030F
Mar 03 12:23:56.027207 afpd[859] {dsi_tcp.c:521} (info:DSI): dsi_tcp: hostname 'meatwad' resolves to loopback address
Mar 03 12:23:56.027234 afpd[859] {dsi_tcp.c:289} (info:DSI): dsi_tcp: '192.168.99.100:548' on interface 'em0' will be used instead.
Mar 03 12:23:56.027265 afpd[859] {auth.c:111} (info:AFPDaemon): uam: "DHX2" available
Mar 03 12:23:56.027268 afpd[859] {auth.c:111} (info:AFPDaemon): uam: "DHCAST128" available
Mar 03 12:23:56.027271 afpd[859] {status.c:363} (info:AFPDaemon): servername: meatwad
Mar 03 12:23:56.027276 afpd[859] {afp_config.c:198} (note:AFPDaemon): Netatalk AFP/TCP listening on 192.168.99.100:548
Mar 03 12:23:56.027287 afpd[859] {cnid.c:55} (info:AFPDaemon): Registering CNID module [last]
Mar 03 12:23:56.027289 afpd[859] {cnid.c:55} (info:AFPDaemon): Registering CNID module [dbd]
Mar 03 12:23:56.027292 afpd[859] {cnid.c:55} (info:AFPDaemon): Registering CNID module [tdb]
Mar 03 12:24:33.611643 afpd[859] {main.c:149} (info:AFPDaemon): child[936]: exited 4
Mar 03 12:24:33.613944 afpd[937] {dsi_tcp.c:241} (info:DSI): AFP/TCP session from 192.168.99.102:52641
Mar 03 12:24:33.615900 afpd[859] {main.c:151} (info:AFPDaemon): child[937]: done
Mar 03 12:24:33.662909 afpd[938] {dsi_tcp.c:241} (info:DSI): AFP/TCP session from 192.168.99.102:52642
Mar 03 12:24:33.664773 afpd[938] {uams_dhx2_passwd.c:265} (info:UAMS): DHX2 login: alex
Mar 03 12:24:34.123277 afpd[938] {auth.c:235} (note:AFPDaemon): Login by alex (AFP3.4)
Mar 03 12:24:42.743463 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:42.743495 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:43.505587 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:43.505596 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:43.507262 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff/."): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:43.511743 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff/."): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:43.629389 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:43.629401 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:44.010214 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:44.010223 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:44.220511 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:44.220520 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:44.227305 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:44.227311 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:44.335163 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:44.335171 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:44.357627 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff/."): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:44.969843 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:44.969855 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:46.311572 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:46.311583 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:47.335662 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:47.335672 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:47.509969 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:47.509978 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:47.612342 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:47.612350 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:48.616369 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:48.616389 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:48.704837 afpd[859] {main.c:149} (info:AFPDaemon): child[946]: exited 4
Mar 03 12:24:49.044818 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:49.044829 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:24:50.897033 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff/."): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:53.722263 afpd[859] {main.c:149} (info:AFPDaemon): child[947]: exited 4
Mar 03 12:24:55.786600 afpd[859] {main.c:149} (info:AFPDaemon): child[948]: exited 4
Mar 03 12:24:55.789372 afpd[949] {dsi_tcp.c:241} (info:DSI): AFP/TCP session from 192.168.99.102:52665
Mar 03 12:24:55.791314 afpd[859] {main.c:151} (info:AFPDaemon): child[949]: done
Mar 03 12:24:59.458758 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff/."): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:59.547413 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:24:59.547426 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:25:01.920839 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:25:01.920854 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:25:01.923982 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:25:01.923992 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:25:02.733794 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:25:02.733807 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:25:03.300608 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:25:03.300622 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:25:04.552474 afpd[859] {main.c:149} (info:AFPDaemon): child[953]: exited 4
Mar 03 12:25:09.558372 afpd[859] {main.c:149} (info:AFPDaemon): child[954]: exited 4
Mar 03 12:25:09.632904 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff/."): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:25:09.723010 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:25:09.723024 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:25:10.649893 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:25:10.649905 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:25:14.653025 afpd[859] {main.c:149} (info:AFPDaemon): child[955]: exited 4
Mar 03 12:25:16.885681 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff/."): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:25:16.885716 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff/."): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:25:16.886770 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff/."): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:25:16.894055 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:25:16.894060 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:25:17.351214 afpd[938] {ad_open.c:818} (error:ad): ad_header_read_ea("/storage/stuff"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Mar 03 12:25:17.351228 afpd[938] {ad_open.c:1280} (error:ad): ad_open_hf_ea: unexpected: Invalid argument
Mar 03 12:25:19.658381 afpd[859] {main.c:149} (info:AFPDaemon): child[956]: exited 4
Mar 03 12:25:24.676218 afpd[859] {main.c:149} (info:AFPDaemon): child[957]: exited 4
Mar 03 12:25:29.847104 afpd[859] {main.c:149} (info:AFPDaemon): child[958]: exited 4
Mar 03 12:25:57.741669 afpd[938] {auth.c:804} (note:AFPDaemon): AFP logout by alex
Mar 03 12:25:57.743388 afpd[938] {afp_dsi.c:108} (note:AFPDaemon): AFP statistics: 148.80 KB read, 2176.09 KB written
Mar 03 12:25:57.743396 afpd[938] {dircache.c:615} (info:AFPDaemon): dircache statistics: entries: 8011, lookups: 23639, hits: 5248, misses: 18390, added: 17227, removed: 9216, expunged: 1, evicted: 9216
Mar 03 12:25:57.743410 afpd[938] {afp_dsi.c:584} (note:AFPDaemon): done
Mar 03 12:25:57.744122 afpd[859] {main.c:151} (info:AFPDaemon): child[938]: done

@rdmark rdmark changed the title Invalid metadata EA Invalid metadata EA in logs Mar 3, 2023
@rdmark
Copy link
Member

rdmark commented Mar 5, 2023

@Adubzzy I tried to recreate your setup locally on a Debian Bookworm VM, but couldn't reproduce your issue immediately there. Would you say that your shared volumes have quite a lot of accumulated data? Just to estimate the probability that there is some corrupted EA data on your file system. It's weird though that it throws the error on the root of the shared volume (/storage/stuff) and not an a particular file that it should, by glancing at the code.

If you can give me some pointers on how to set up FreeBSD I can give that a go as well.

@Adubzzy
Copy link
Author

Adubzzy commented Mar 14, 2023

Since this only happens on 2 of my shared drives, is there a command to check for corrupted EA data?

@Adubzzy
Copy link
Author

Adubzzy commented Mar 14, 2023

@rdmark Not sure if it would have helped I also ran all of the dbd commands on the 2 shares that throw and error and nothing:

I ran:

8107 sudo dbd -s /usr/home/alex
8109 sudo dbd -s /storage/stuff
8122 sudo dbd -f -t /usr/home/Alex

I gave up after running the last command because I kept getting the error. The '-f' command is supposed to rebuild the entire CNID DB. So I don't know what is causing this. The Timemachine shares don't cause a problem.

I also check that on my Mac I am do have the boolean set to create .DS_Store files

$ defaults read com.apple.desktopservices DSDontWriteNetworkStores
0

@rdmark
Copy link
Member

rdmark commented Mar 28, 2023

@Adubzzy Thanks for the update. We've gotten other reports that dbd isn't working reliably with certain metadata. I'm mostly familiar with netatalk2 and adouble so I can't really provide any more insight at this point.

For context, were this volumes originally netatalk2 volumes that were converted to netatalk3? Or created with netatalk3 from scratch?

@Adubzzy
Copy link
Author

Adubzzy commented Mar 28, 2023

Yes I believe the problematic shares were created originally in netatlk2 and then I upgraded to netatalk3. If this isn't a big deal for anyone else, I don't really think it's a good idea to keep pursuing, I just silenced my logs to print WARN level messages and it stopped complaining. I plan on converting everything to SMB when I upgrade FreeBSD to v13 in a few weeks. That and afp is a dead protocol from apple's side, so if I can get everything running on SMB I'll be fine.

@rdmark
Copy link
Member

rdmark commented Mar 29, 2023

@Adubzzy I think it's still worth keeping this issue ticket open regardless. Converting netatalk2 volumes to netatalk3 is still an active usecase since netatalk2 is still alive and kicking (appletalk support.)

Anyhow, I would recommend using Netatalk / AFP only if you need to network older Macs (Mountain Lion and earlier) or Apple II clients. If all your Macs speak SMB natively then Samba is clearly the way to go. :)

@NJRoadfan
Copy link

I suspect this is the same regression I encountered with the dbd tool after fixes were applied to harden AppleDouble verification. When the dbd tool encounters a malformed AppleDouble header, it can't fix it because the stricter checking causes the utility to terminate parsing. This leads to a catch-22 where the utility that can repair the errors, but can't because the adouble library detects an error and bails.

I would try downgrading to Netatalk 3.1.12, running dbd on your file shares and seeing if that corrects the errors.

@derekmarcotte
Copy link

derekmarcotte commented Apr 6, 2023

If you can give me some pointers on how to set up FreeBSD I can give that a go as well.

Grab an installer from either:

https://download.freebsd.org/ftp/releases/ISO-IMAGES/13.1/ (try FreeBSD-13.1-RELEASE-i386-disc1.iso for CD or FreeBSD-13.1-RELEASE-i386-memstick.img for USB)

https://download.freebsd.org/ftp/releases/VM-IMAGES/13.1-RELEASE/amd64/Latest/

Once installed, as root (or via sudo https://people.freebsd.org/~murray/handbook/security-sudo.html), you can install netatalk with pkg install netatalk3 and configure /usr/local/etc/afp.conf accordingly.

I think you will find this occurs in all the directories in the configured share, regardless of metadata. I had tried dbd -f and other flags as well.

@rdmark
Copy link
Member

rdmark commented Apr 13, 2023

@derekmarcotte Thanks, I have FreeBSD set up in a VM now and running a Netatalk share.

@rdmark
Copy link
Member

rdmark commented Apr 21, 2023

@derekmarcotte I have been running the latest main development code on FreeBSD for about a week now, doing some light copying of files back and forth on macOS Ventura and Mac OS 9.1 clients. No EA errors as of yet. So I still think what we're seeing is some kind of corrupted metadata is the culprit, whether it's caused by failed AFP operations or failed conversion from ad to ea metadata...

Edit: what I meant to say is that it's non-trivial to reproduce with a fresh shared volume.

@rdmark
Copy link
Member

rdmark commented Apr 24, 2023

Site note: One potential workaround is to disable the ad to ea on-the-fly conversion in netatalk3. The afp.conf option is:

convert appledouble = no

@Adubzzy
Copy link
Author

Adubzzy commented Apr 24, 2023

Site note: One potential workaround is to disable the ad to ea on-the-fly conversion in netatalk3. The afp.conf option is:

convert appledouble = no

I am not surprised you are having a difficult time reproducing this issue. As I mentioned in previous posts, this only happens on 2 out of 4 directories I have listed. One is my home directory and another is a miscellaneous storage dir. I also have 2 directories for TimeMachine Backups and they load fine. Also when I create a new test directory it works fine. It seems like something got corrupted and I don't know how to fix them. Even though I took the measures I noted above with the "DBD" command.

If I want to try the convert appledouble = no setting.... Does that go into the global section or individual share sections?

@rdmark
Copy link
Member

rdmark commented Apr 25, 2023

If I want to try the convert appledouble = no setting.... Does that go into the global section or individual share sections?

Please have a look at the man page: https://netatalk.sourceforge.io/3.1/htmldocs/afp.conf.5.html
It's listed under the "Volume options" heading...

@rdmark
Copy link
Member

rdmark commented May 23, 2023

@anodos325 Do you have an idea why the current code throws invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file. but doesn't actually crash? Asking since you were the last person touching the code.

What I see is the AFP_ASSERT(!(ret != 0 && errno == EINVAL)); line that you added, which I think evaluates to AFP_ASSERT(false). Doesn't this have the opposite effect that you intended? Or am I misunderstanding the functionality of AFP_ASSERT?

Edit: So AFP_ASSERT is defined as

#define AFP_ASSERT(b) \
    do {                                                                \
        if (!(b)) {                                                     \
            AFP_PANIC(#b);                                              \
        } \
    } while(0);

...which means that AFP_ASSERT(false) lets the program continue running, doesn't it?

@Adubzzy
Copy link
Author

Adubzzy commented May 23, 2023

I added the convert appledouble = no option to the volumes that give me trouble and still getting errors when log level is set to "error". Not sure why but I didn't include my full config, I added below to see if there is something maybe I am missing.

;
; Netatalk 3.x configuration file
;

[Global]
;mimic model = TimeCapsule6,106
log level = default:error
log file = /var/log/afpd.log
;hosts allow = 192.168.99.0/24
afpstats = yes

[Homes]
basedir regex = /usr/home
follow symlinks = yes
convert appledouble = no

[TimeMachine2]
path = /storage/TimeMachine2
valid users = natalie
time machine = yes
vol size limit = 300000

[TimeMachineLB]
path = /storage/TimeMachineLB
valid users = alex
time machine = yes
vol size limit = 300000

[Stuff]
path = /storage/stuff
valid users = alex natalie
convert appledouble = no

[Torrent]
path = /usr/transmission/home/Downloads
valid users = alex
convert appledouble = no

@Adubzzy
Copy link
Author

Adubzzy commented May 23, 2023

On a side note when I was running the command sudo dbd -cv /usr/home/alex as per the man pages which said it would be better to do a conversion that way. The same files kept popping up as needing conversion. And they were all "dot files" like .zshrc etc. And every time I ran it, it seemed like it was a problem, I also did the sudo dbd -fv /usr/home/alex command to rebuild the CNID backend and it didn't seem to improve the issue.

@rdmark
Copy link
Member

rdmark commented Jul 5, 2023

@Adubzzy and everyone else who regularly experience this issue: I have a PR with additional logging that should help pinpoint a bit better which EA header that fails the validation. #363

If you haven't built netatalk from scratch before I'd be happy to assist.

@rdmark
Copy link
Member

rdmark commented Jul 6, 2023

As @derekmarcotte calls out, and as evident from all the recent logs that I've seen, the issue happens when ad_header_read_ea() is run on directories specifically. I'm not too familiar with what kinds of EA that are valid for directories, but I played around a little with the bundle xattr in macOS and classic attributes such as labels and custom icons via Mac OS 7.5... Alas not getting any closer to reproducing the issue.

For those of you where the error isn't fatal, can you please run xattr -l on an affected dir within macOS and share which attributes that is set for it?

BTW, the "easy" solution here is to partially revert 4140e54 and restore the call to sys_removexattr() in EC_CLEANUP that simply discards the metadata that fails validation. However, I want to try to understand if the offending metadata is actually invalid first. Looking at the code, I think the validation is simply checking the size of the data against an expected size for the attribute type defined in a constant.

@southbay82
Copy link

Site note: One potential workaround is to disable the ad to ea on-the-fly conversion in netatalk3. The afp.conf option is:

convert appledouble = no

This option worked perfectly for me, I was having issues with my time machine share. Thank you!

@starkjs
Copy link

starkjs commented Jul 30, 2023

Ever since I upgraded to 3.1.12 on Ubuntu 20.04 (netatalk:amd64 (3.1.12ds-4, 3.1.12ds-4ubuntu0.20.04.1))
I have been having this very issue. We have 3 Mac's that do a TM backup alternatively to two Ubuntu Servers.

I have been getting 1000's of these errors:

$ awk '/ad_header_read_ea/{print $6}' /var/log/afpd.log | sort | uniq -c
   5668 ad_header_read_ea("/mnt/SeagateExpansion/TimeMachine-Jack"):
    250 ad_header_read_ea("/mnt/SeagateExpansion/TimeMachine-Jack/."):
  17660 ad_header_read_ea("/mnt/SeagateExpansion/TimeMachine-Josh"):
    629 ad_header_read_ea("/mnt/SeagateExpansion/TimeMachine-Josh/."):
  98346 ad_header_read_ea("/mnt/SeagateExpansion/TimeMachine-Kris"):
   1423 ad_header_read_ea("/mnt/SeagateExpansion/TimeMachine-Kris/."):

I have been trying many different things to clean up these errors for the last two months, I have also tired to delete and recreate the DB (in /var/lib/netatalk/CNID/) and still no luck.

Today I thought to try and copy the sparsebundle to a new AFP share to see if it's ok, that seems to have worked.
So I then wonders what the difference was and I tried to just recreate the top level directory and it's worked.

What I did in the end was:

systemctl stop netatalk
mv /mnt/SeagateExpansion/TimeMachine-Josh /mnt/SeagateExpansion/TimeMachine-Josh_error
mkdir /mnt/SeagateExpansion/TimeMachine-Josh
mv /mnt/SeagateExpansion/TimeMachine-Josh_error/* /mnt/SeagateExpansion/TimeMachine-Josh/
rm -rf /var/lib/netatalk/CNID/TimeMachine-Josh
systemctl start netatalk
dbd -s /mnt/SeagateExpansion/TimeMachine-Josh

After doing this, it seems to have stopped reporting the ad errors and the Time Machine backups are all working.
No idea what was going on, I can't understand why recreating the AFP mount path has made a difference.

There must be something different between this new version 3.1.12~ds-4ubuntu0.20.04.1 and the previous version I had
Unpacking netatalk (3.1.12~ds-4ubuntu0.20.04.1) over (3.1.12~ds-4)

Thanks
Josh

@rdmark
Copy link
Member

rdmark commented Jul 30, 2023

@starkjs Thanks for sharing your scenario! Yes, as I demonstrated in another ticket, #357 , Ubuntu recently applied a suite of security fixes which strengthens the validation of extended attribute headers. As has been demonstrated here, and by other reports, there is some kind of extended attribute of the root directory of a shared netatalk volume that is not passing this validation.

If you saved a backup of the malfunctioning volumes before you moved them, I have a PR with some added logging here: #363 . If you're able to compile and run netatalk from the code in this PR, it would help pinpoint which type of metadata is causing the failed validation.

And, please try running getfaddr on the malfunctioning directory as well, which should tell us exactly which extended attributes it has. (May require the "attr" package on Debian and other distros.)

@starkjs
Copy link

starkjs commented Jul 30, 2023

Hi @rdmark,

Ah, thanks for the link to the other ticket, I had not found that one in my searching. That looks very similar, but I am not getting a crash. So that is very curious!

I was able to do the getfaddr and I can see

$ getfattr -d /mnt/SeagateExpansion/TimeMachine-Josh_errors
getfattr: Removing leading '/' from absolute path names
# file: mnt/SeagateExpansion/TimeMachine-Josh_errors
user.org.netatalk.Metadata=0sAAUWBwACAAAAAAAAAAAAAAAAAAAAAAAAAAgAAAAEAAAAmgAAAAAAAAAIAAABYgAAABAAAAAJAAAAegAAACAAAAAOAAABcgAAAASAREVWAAABdgAAAACASU5PAAABfgAAAACAU1lOAAABhgAAAACAU1Z+AAABjgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAKg5QrCoOUKyAAAAAKg5QrAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
user.org.netatalk.has-Extended-Attributes=0seWVzAA==

If I compare that to the new directory I have created, it looks like the user.org.netatalk.Metadata is different, but not sure if or how it matters.

$ getfattr -d /mnt/SeagateExpansion/TimeMachine-Josh
getfattr: Removing leading '/' from absolute path names
# file: mnt/SeagateExpansion/TimeMachine-Josh
user.org.netatalk.Metadata=0sAAUWBwACAAAAAAAAAAAAAAAAAAAAAAAAAAgAAAAEAAAAmgAAAMgAAAAIAAABYgAAABAAAAAJAAAAegAAACAAAAAOAAABcgAAAASAREVWAAABdgAAAAiASU5PAAABfgAAAAiAU1lOAAABhgAAAAiAU1Z+AAABjgAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAALFhLjixYS46AAAAALFhLjgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
user.org.netatalk.has-Extended-Attributes=0seWVzAA==

Yep, happy to compile based on your PR, I don't think I am going to get time to do that this week.
I should be able to have a look the week after for you.

Thanks
Josh

@rdmark
Copy link
Member

rdmark commented Aug 10, 2023

All, I have a tentative fix for this issue now in PR #363

If you have a moment to spare, I would appreciate if you could test it and provide feedback.

@rdmark rdmark self-assigned this Aug 10, 2023
@rdmark
Copy link
Member

rdmark commented Aug 11, 2023

To explain the theory of the fix, thanks to additional logging and help from the reporter of #368, we found out that the root dir of their shared volume had the four netatalk private AppleDouble entires (PRIVDEV/PRIVINO/PRIVSYN/PRIVID) set but no data in them (zero length). The ad_entry() function had a branch that checks for len == 0 and returns NULL which leads to the assertion further up the stack.

However, reading the AppleDouble spec I could confirm that zero length AD entries are a valid case. The caveat is that the private netatalk entries aren't of course covered by Apple's spec, but our testing so far suggests that zero length private entries are safe.

Now, I first filed my own fix to allow zero length entries in the previously mentioned PR. However, further research made me realize that @Synology-andychen actually filed a fix for this a year ago in #178 so I went ahead and merged his fix to main earlier. My own PR only has a few minor additional fixes in it now to remove a special case for the COMMENT entry amongst other things.

Anyhow, there is still a chance that some of you are actually encountering a different issue than the zero length entry described above, so please be continuously vigilant about post-fix "invalid metadata EA this is now being treated as a fatal error" messages which will require further analysis to figure out what is failing the validity check!

@rdmark rdmark changed the title Invalid metadata EA in logs Invalid metadata EA errors Aug 11, 2023
@rdmark
Copy link
Member

rdmark commented Aug 14, 2023

The Debian Buster netatalk package has been hotfixed now, so if you're running Buster you can do apt update && apt upgrade to get the fixed binary.

The patch will probably propagate to Ubuntu soon, too, although I don't have any insights into their process.

@rdmark
Copy link
Member

rdmark commented Aug 27, 2023

Closing this out now after merging #363 which is a minor change that simplifies the handling of COMMENT entries.

Since the logic changed quite a lot, I've left the assertion in place for one more release cycle. If you all encounter this "invalid metadata EA" error again, this means there's yet another metadata corner case that needs to be addressed. In that case, please raise a new issue ticket and share the full log and backtrace leading up the error.

Once we are confident that all corner cases have been addressed, we will take action in #400 to remove the assertion and allow netatalk to delete invalid entries in the future.

@rdmark rdmark closed this as completed Aug 27, 2023
@rdmark
Copy link
Member

rdmark commented Aug 31, 2023

Note that the netatalk 3.1.15~ds-3 package in Debian unstable has been patched now with a fix for this bug:

$ apt show netatalk
Package: netatalk
Version: 3.1.15~ds-3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Status: Done
Development

No branches or pull requests

6 participants