Events not firing #10

Open
whatupdave opened this Issue · 236 comments
@whatupdave

I'm trying to pin down exactly what's happening here. The specs don't pass, I ran it in debug mode:

creating Makefile
CFLAGS='-isysroot /Developer/SDKs/MacOSX10.6.sdk -mmacosx-version-min=10.6 -mdynamic-no-pic -std=gnu99 -Os -pipe -Wmissing-prototypes -Wreturn-type -Wmissing-braces -Wparentheses -Wswitch -Wunused-function -Wunused-label -Wunused-parameter -Wunused-variable -Wunused-value -Wuninitialized -Wunknown-pragmas -Wshadow -Wfour-char-constants -Wsign-compare -Wnewline-eof -Wconversion -Wshorten-64-to-32 -Wglobal-constructors -pedantic' /usr/bin/gcc -isysroot /Developer/SDKs/MacOSX10.6.sdk -mmacosx-version-min=10.6 -mdynamic-no-pic -std=gnu99 -dead_strip -framework CoreServices -D DEBUG=true -o '/users/dave/code/temp/rb-fsevent/bin/fsevent_watch' fsevent/fsevent_watch.c
fsevent_watch compiled
.
append_path called for: /users/dave/code/temp/rb-fsevent/spec/fixtures/custom 'path
  resolved path to: /Users/dave/code/temp/rb-fsevent/spec/fixtures/custom 'path

config.sinceWhen    18446744073709551615
config.latency      0.300000
config.flags        00000000
config.paths
  /Users/dave/code/temp/rb-fsevent/spec/fixtures/custom 'path

FSEventStreamRef @ 0x1001085c0:
   allocator = 0x7fff709faee0
   callback = 0x100001522
   context = {0, 0x0, 0x0, 0x0, 0x0}
   numPathsToWatch = 1
   pathsToWatch = 0x7fff709faee0
        pathsToWatch[0] = '/Users/dave/code/temp/rb-fsevent/spec/fixtures/custom 'path'
   latestEventId = -1
   latency = 300000 (microseconds)
   flags = 0x00000000
   runLoop = 0x0
   runLoopMode = 0x0

F
append_path called for: /users/dave/code/temp/rb-fsevent/spec/fixtures/custom 'path
  resolved path to: /Users/dave/code/temp/rb-fsevent/spec/fixtures/custom 'path

config.sinceWhen    18446744073709551615
config.latency      0.300000
config.flags        00000000
config.paths
  /Users/dave/code/temp/rb-fsevent/spec/fixtures/custom 'path

FSEventStreamRef @ 0x1001085c0:
   allocator = 0x7fff709faee0
   callback = 0x100001522
   context = {0, 0x0, 0x0, 0x0, 0x0}
   numPathsToWatch = 1
   pathsToWatch = 0x7fff709faee0
        pathsToWatch[0] = '/Users/dave/code/temp/rb-fsevent/spec/fixtures/custom 'path'
   latestEventId = -1
   latency = 300000 (microseconds)
   flags = 0x00000000
   runLoop = 0x0
   runLoopMode = 0x0


append_path called for: /users/dave/code/temp/rb-fsevent/spec/fixtures
  resolved path to: /Users/dave/code/temp/rb-fsevent/spec/fixtures

config.sinceWhen    18446744073709551615
config.latency      0.500000
config.flags        00000000
config.paths
  /Users/dave/code/temp/rb-fsevent/spec/fixtures

FSEventStreamRef @ 0x1001085e0:
   allocator = 0x7fff709faee0
   callback = 0x100001522
   context = {0, 0x0, 0x0, 0x0, 0x0}
   numPathsToWatch = 1
   pathsToWatch = 0x7fff709faee0
        pathsToWatch[0] = '/Users/dave/code/temp/rb-fsevent/spec/fixtures'
   latestEventId = -1
   latency = 500000 (microseconds)
   flags = 0x00000000
   runLoop = 0x0
   runLoopMode = 0x0

F
append_path called for: /users/dave/code/temp/rb-fsevent/spec/fixtures
  resolved path to: /Users/dave/code/temp/rb-fsevent/spec/fixtures

config.sinceWhen    18446744073709551615
config.latency      0.500000
config.flags        00000000
config.paths
  /Users/dave/code/temp/rb-fsevent/spec/fixtures

FSEventStreamRef @ 0x1001085e0:
   allocator = 0x7fff709faee0
   callback = 0x100001522
   context = {0, 0x0, 0x0, 0x0, 0x0}
   numPathsToWatch = 1
   pathsToWatch = 0x7fff709faee0
        pathsToWatch[0] = '/Users/dave/code/temp/rb-fsevent/spec/fixtures'
   latestEventId = -1
   latency = 500000 (microseconds)
   flags = 0x00000000
   runLoop = 0x0
   runLoopMode = 0x0


append_path called for: /users/dave/code/temp/rb-fsevent/spec/fixtures
  resolved path to: /Users/dave/code/temp/rb-fsevent/spec/fixtures

config.sinceWhen    18446744073709551615
config.latency      0.500000
config.flags        00000000
config.paths
  /Users/dave/code/temp/rb-fsevent/spec/fixtures

FSEventStreamRef @ 0x1001085e0:
   allocator = 0x7fff709faee0
   callback = 0x100001522
   context = {0, 0x0, 0x0, 0x0, 0x0}
   numPathsToWatch = 1
   pathsToWatch = 0x7fff709faee0
        pathsToWatch[0] = '/Users/dave/code/temp/rb-fsevent/spec/fixtures'
   latestEventId = -1
   latency = 500000 (microseconds)
   flags = 0x00000000
   runLoop = 0x0
   runLoopMode = 0x0

F
append_path called for: /users/dave/code/temp/rb-fsevent/spec/fixtures
  resolved path to: /Users/dave/code/temp/rb-fsevent/spec/fixtures

config.sinceWhen    18446744073709551615
config.latency      0.500000
config.flags        00000000
config.paths
  /Users/dave/code/temp/rb-fsevent/spec/fixtures

FSEventStreamRef @ 0x1001085e0:
   allocator = 0x7fff709faee0
   callback = 0x100001522
   context = {0, 0x0, 0x0, 0x0, 0x0}
   numPathsToWatch = 1
   pathsToWatch = 0x7fff709faee0
        pathsToWatch[0] = '/Users/dave/code/temp/rb-fsevent/spec/fixtures'
   latestEventId = -1
   latency = 500000 (microseconds)
   flags = 0x00000000
   runLoop = 0x0
   runLoopMode = 0x0


append_path called for: /users/dave/code/temp/rb-fsevent/spec/fixtures
  resolved path to: /Users/dave/code/temp/rb-fsevent/spec/fixtures

config.sinceWhen    18446744073709551615
config.latency      0.500000
config.flags        00000000
config.paths
  /Users/dave/code/temp/rb-fsevent/spec/fixtures

FSEventStreamRef @ 0x1001085e0:
   allocator = 0x7fff709faee0
   callback = 0x100001522
   context = {0, 0x0, 0x0, 0x0, 0x0}
   numPathsToWatch = 1
   pathsToWatch = 0x7fff709faee0
        pathsToWatch[0] = '/Users/dave/code/temp/rb-fsevent/spec/fixtures'
   latestEventId = -1
   latency = 500000 (microseconds)
   flags = 0x00000000
   runLoop = 0x0
   runLoopMode = 0x0

F
append_path called for: /users/dave/code/temp/rb-fsevent/spec/fixtures
  resolved path to: /Users/dave/code/temp/rb-fsevent/spec/fixtures

config.sinceWhen    18446744073709551615
config.latency      0.500000
config.flags        00000000
config.paths
  /Users/dave/code/temp/rb-fsevent/spec/fixtures

FSEventStreamRef @ 0x1001085e0:
   allocator = 0x7fff709faee0
   callback = 0x100001522
   context = {0, 0x0, 0x0, 0x0, 0x0}
   numPathsToWatch = 1
   pathsToWatch = 0x7fff709faee0
        pathsToWatch[0] = '/Users/dave/code/temp/rb-fsevent/spec/fixtures'
   latestEventId = -1
   latency = 500000 (microseconds)
   flags = 0x00000000
   runLoop = 0x0
   runLoopMode = 0x0



Failures:

  1) FSEvent should work with path with an apostrophe
     Failure/Error: @results.should == [custom_path.to_s + '/']
       expected: ["/users/dave/code/temp/rb-fsevent/spec/fixtures/custom 'path/"]
            got: [] (using ==)
       Diff:
       @@ -1,2 +1,2 @@
       -["/users/dave/code/temp/rb-fsevent/spec/fixtures/custom 'path/"]
       +[]
     # ./spec/rb-fsevent/fsevent_spec.rb:30:in `block (2 levels) in <top (required)>'

  2) FSEvent should catch new file
     Failure/Error: @results.should == [@fixture_path.to_s + '/']
       expected: ["/users/dave/code/temp/rb-fsevent/spec/fixtures/"]
            got: [] (using ==)
       Diff:
       @@ -1,2 +1,2 @@
       -["/users/dave/code/temp/rb-fsevent/spec/fixtures/"]
       +[]
     # ./spec/rb-fsevent/fsevent_spec.rb:40:in `block (2 levels) in <top (required)>'

  3) FSEvent should catch file update
     Failure/Error: @results.should == [@fixture_path.join("folder1/").to_s]
       expected: ["/users/dave/code/temp/rb-fsevent/spec/fixtures/folder1/"]
            got: [] (using ==)
       Diff:
       @@ -1,2 +1,2 @@
       -["/users/dave/code/temp/rb-fsevent/spec/fixtures/folder1/"]
       +[]
     # ./spec/rb-fsevent/fsevent_spec.rb:49:in `block (2 levels) in <top (required)>'

  4) FSEvent should catch files update
     Failure/Error: @results.should == [@fixture_path.join("folder1/").to_s, @fixture_path.join("folder1/folder2/").to_s]
       expected: ["/users/dave/code/temp/rb-fsevent/spec/fixtures/folder1/", "/users/dave/code/temp/rb-fsevent/spec/fixtures/folder1/folder2/"]
            got: [] (using ==)
       Diff:
       @@ -1,3 +1,2 @@
       -["/users/dave/code/temp/rb-fsevent/spec/fixtures/folder1/",
       - "/users/dave/code/temp/rb-fsevent/spec/fixtures/folder1/folder2/"]
       +[]
     # ./spec/rb-fsevent/fsevent_spec.rb:61:in `block (2 levels) in <top (required)>'

Finished in 13.4 seconds
5 examples, 4 failures

I'm running ruby-1.9.2-p180
Mac OSX 10.6.6

I'm not sure what other information would be helpful. If I build the debug binary and run it on a directory, I'm not seeing any events firing.

Any ideas?

@whatupdave

OK, after filing this I checked Apple's Software Update and updated to Mac OS X 10.6.7, which has fixed the problem of file system events not firing.

@ttilley
Collaborator

CRAZY! I just read the email github sent about the bug and was worried I forgot to double-check everything in xcode3.latest or something silly. ;)

I'm glad that fixed things for you, but it's worrisome that there might be a configuration out there that just flat out doesn't work. shrug

@ttilley ttilley closed this
@whatupdave

I'm not 100% sure if it was the Mac version or something else going on in the system. I rebooted a few times to see if it made a difference. I did install Windows via Boot Camp the other day so I guess my beautiful Mac will never be the same....

@rmoriz

having exactly the same issues. running 10.6.7 with all patches + Xcode 4.0.2

events get fired by OSX as http://www.fernlightning.com/doku.php?id=software:fseventer:start shows them

could not find a way to fix this. tried several different rubies (1.8.7, 1.9.2, 1.8.7EE) without success.

@rmoriz

However v0.3.10 works (at least specs):

(in /private/tmp/rb-fsevent)
/Users/rmoriz/.rvm/rubies/ree-1.8.7-2011.03/bin/ruby -S bundle exec rspec ./spec/rb-fsevent/fsevent_spec.rb
No examples were matched by {:focus=>true}, running all
creating Makefile
fsevent_watch compiled
....

Finished in 13.27 seconds
4 examples, 0 failures
@ttilley ttilley reopened this
@ttilley
Collaborator

rmoriz: there are two fsevent APIs. one is private, fairly low level, and essentially allows for a userspace application to insert itself in the middle of I/O events in the kernel... which allows not-well-behaved code to cause serious problems. This is the API used by spotlight, and apparently also fseventer.

The public API for FSEvents is based on a daemon that makes use of this much lower level API to log a much less detailed version of events, trimmed down to the directory level, under the /root_of_that_volume/.fseventsd/ directory.

Now... the most confusing detail of what you're seeing is that 0.3.10 works and 0.4 does not. This is different from what anyone else I've heard from so far has been seeing: you either have issues that break use of FSEvents by any and all applications (minus spotlight, and cheaters that use undocumented APIs), or issues that break FSEvents when working within a specific volume.

I recently heard from a user who disabled spotlight, re-enabled it, rebooted twice, and POOF... magic. Things worked again. I'm still a bit frustrated I wasn't able to figure out what was going on before his machine just started working again out of nowhere, and thus don't understand the solution to what he was seeing. However, more out of superstition than logic (as spotlight uses /dev/fsevents, not fseventsd), you might want to go to preferences -> spotlight -> privacy, add your root filesystem, reboot, remove it, reboot. You have no idea how painful it is for me to even suggest such a thing.

Before hopping down that rabbit hole, however, it's worth re-trying 0.4 if you're seeing 0.3.10 work. With the changes involved it makes very little sense for you to not be seeing events. I can see there being compounding issues of all kinds that might stop them from reaching you, but not that fsevent_watch itself isn't seeing them. Please re-compile with debugging as well.

export FWDEBUG="true"
gem install rb-fsevent

...Note to self: make that a commandline option.

@rmoriz

@ttilley that was me on twitter ;)

btw. 0.3.10 didn't work, only the tests were all green, sorry for the confusion. As far as I saw, the 0.4 release added more tests.

The strange thing was that other fsevent based apps worked but rb-fsevent did not. First I thought maybe the noatime mount option (b/c of my SSD) could be a problem. Then I remembered that I had disabled spotlight indexing on the main partition. After removing the partition from the ignore list the indexer started. I rebooted 2 times because of other things…

Then rb-fsevent (used in guard) started working perfectly out of the blue. The rb-fsevent tests got green!
I've now disabled spotlight indexing again, rb-fsevent still works.

tl;dr

  • re-activate spotlight indexing so mds starts
  • reboot => rb-fsevent may work now.
  • disable the spotlight indexing again

BTW: I've updated my Xcode to the 4.0.x release. That alone did not help but maybe it's part of the solution? (System was already 10.6.7 when I started)

@ttilley
Collaborator

oh. well. alrighty then!

Definitely good to know that we have since improved the quality of the test suite so that it doesn't give false passes... Especially since I wasn't aware of them. A few of the modifications between 0.3.10 and 0.4 were actually done to prevent false failures. Both may have had the same cause.

@ttilley ttilley closed this
@lox
lox commented

I just had exactly the same issue. Very strange, nothing had changed in terms of ruby, rb-fsevent or any other software on my system, my sync script just stopped getting events. I could see events with fseventer, but nothing was coming through to my script. OSX is 10.6.7.

I tried rebooting twice, no result. I excluded the root partition from indexing, rebooted and everything is working again.

@lox
lox commented

Unfortunately the problem re-surfaces once I remove my root partition from the Spotlight exclusion list. Frustrating.

@quackingduck

Same issue here. Adding the root partition to the spotlight exclusion list and rebooting fixed the problem

@lox
lox commented

Actually, I was able to add just the directories in question to the exclusion list and it works fine.

@bquorning

On OSX 10.6.7, I have absolutely no luck making rb-fsevent work. Tried adding my "code" folder to Spotlight’s exclusion list and rebooted the machine. Still doesn’t work.

@ttilley ttilley reopened this
@ttilley
Collaborator

can you all check each volume for /.fseventsd/no_log ? Also, make sure each volume has a /.fseventsd/fseventsd-uuid file containing a UUID of the format:

ABC01D2E-F345-6ABC-D7E8-F9AB01C234D5

Since I'm not using the volume specific API, it might also just blindly be using the root filesystem settings... That's certainly possible.

Also check to make sure there's an as-root process running called fseventsd (or, more specifically, /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/CarbonCore.framework/Versions/A/Support/fseventsd).

I guess also make sure there's a /dev/fsevents ? Looks like it should be mode 0644 with major,minor of 12,6393236.

Also, grep /private/var/log/system.log for 'fseventsd'. Note that "destroying old logs" messages are harmless, and simply mean that the filesystem was modified without that specific fseventsd knowing about it (like... booting into your lion partition on the same drive and having its spotlight index your snow leopard partition... or what have you). Example:

./system.log:Jun 17 15:09:38 Travis-Tilleys-MacBook-Pro fseventsd[20767]: event logs in /.fseventsd out of sync with volume.  destroying old logs. (1 170038 170050)

I really really wish I knew what the problem was, specifically, for you guys so that I could write in a check for it and display a helpful warning. Any information is invaluable.
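
The checks above can be scripted so that anyone hitting this can paste back comparable results. A minimal Ruby sketch, assuming it is run as root (the .fseventsd directory is normally root-only) and assuming that `ps -A -o user= -o comm=` prints the full executable path, as it appears to on OS X. `fseventsd_report` and `fseventsd_running?` are hypothetical helper names, not part of rb-fsevent:

```ruby
# Hypothetical diagnostic helpers; not part of rb-fsevent.
UUID_FORMAT = /\A\h{8}-\h{4}-\h{4}-\h{4}-\h{12}\z/

# Inspect <root>/.fseventsd for the volume mounted at root.
def fseventsd_report(root)
  dir = File.join(root, ".fseventsd")
  return { :error => "#{dir} is missing" } unless File.directory?(dir)
  # Usually mode 700 and owned by root, so unprivileged runs cannot look inside.
  unless File.readable?(dir) && File.executable?(dir)
    return { :error => "#{dir} is not accessible; re-run as root" }
  end
  uuid_file = File.join(dir, "fseventsd-uuid")
  uuid = File.file?(uuid_file) ? File.read(uuid_file).strip : ""
  {
    :no_log_present => File.exist?(File.join(dir, "no_log")),
    :uuid_valid     => uuid.match?(UUID_FORMAT),
  }
end

# True if the ps output lists a root-owned process whose executable is fseventsd.
def fseventsd_running?(ps_output)
  ps_output.each_line.any? { |line| line =~ %r{\A\s*root\s+(.*/)?fseventsd\s*\z} }
end

# Only run the live checks on OS X; elsewhere the helpers are just defined.
if RUBY_PLATFORM =~ /darwin/
  p fseventsd_report("/")
  puts "fseventsd running as root: #{fseventsd_running?(`ps -A -o user= -o comm=`)}"
  puts "/dev/fsevents present: #{File.exist?('/dev/fsevents')}"
  log = "/private/var/log/system.log"
  puts File.foreach(log).select { |l| l.include?("fseventsd") } if File.readable?(log)
end
```

This only covers the root volume; for other volumes, call `fseventsd_report` with the mount point (for example a path under /Volumes).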

@bquorning

Just tried it on my home computer: it works fine. Running the “Singular path” example code from the README, I now see fsevent.run happily printing a line when a Dir is changed.

My work computer (which I reported rb-fsevent not working on yesterday) just returned when calling fsevent.run. Maybe pipe.eof? returns true right away? Anyhow, I’ll try debugging per your instructions above when I get back to my work machine on Monday.

@robwilliams

I came across this issue the other day and noticed it may be due to case sensitivity of directory names in the home folder.

All my projects were stored in:

/Users/Rob/Projects

I wanted to tab complete without typing the uppercase 'P' so I renamed projects to:

/Users/Rob/projects

Great so now I can tab complete.. Hold on a minute Guard has stopped working.

Renaming it back to Projects fixed it.

I tried creating a spec to catch this error but it passes without issues. :/

  # This only happens in the users home directory from what i can tell
  # so to prove it we test both
  {
    :in_home_directory     => Pathname.new(File.expand_path('~')),
    :in_fixtures_directory => Pathname.new(File.expand_path('../../fixtures/', __FILE__))
  }.each { |directory_description, path|

    it "should work with folder that has been renamed from Uppercase to lowercase #{directory_description}" do

      uppercase_path = path.join("FseventCasetest")
      lowercase_path = path.join("fseventcasetest")
      uppercase_file_path = uppercase_path.join("watch.txt")
      lowercase_file_path = lowercase_path.join("watch.txt")

      # Make sure fixtures are clean
      FileUtils.rm_rf(uppercase_path)
      FileUtils.rm_rf(lowercase_path)

      FileUtils.mkdir(uppercase_path)

      FileUtils.touch(uppercase_file_path)

      # Use OS X mv, which allows renaming a directory to a name that differs
      # only by case (here, its lowercase equivalent) in the same parent directory
      `mv -v #{uppercase_path} #{lowercase_path}`


      @fsevent.watch lowercase_path.to_s do |paths|
        @results += paths
      end

      run
      FileUtils.touch lowercase_file_path
      stop

      File.delete lowercase_file_path
      # Make sure fixtures are clean
      FileUtils.rm_rf(uppercase_path)
      FileUtils.rm_rf(lowercase_path)

      @results.should == [lowercase_path.to_s + '/']

    end
  }
@ttilley
Collaborator

The subset of events involving directory renames (while running) aren't -actually- handled properly, since the communication format between the C subprocess and ruby library doesn't allow for it. This is only an issue when renaming a directory that's explicitly watched (not a subdirectory). I was determined to fix that and a number of other things (see the "coming soon" comments in the readme), but, erm... I got distracted. ^^;

You know, I actually have a fair chunk of time today and throughout this week to devote to the work necessary to check off those TODO items. I guess it's time to get motivated and bang that out...

An aside for @thibaudgg - I intend to use tagged netstrings (probably) as a serialization format. It honestly took an attempt at serializing to JSON in C to make me realize I was being silly creating so much extra work for myself.
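
For anyone unfamiliar with the format: a tagged netstring is `<byte length>:<payload><type tag>`, where the tag is `,` for strings, `#` for integers, `!` for booleans, `~` for null, `]` for lists and `}` for dictionaries. Below is a minimal Ruby encoder sketch. `tnet_dump` and the event fields in the usage line are hypothetical, and this is not the wire format rb-fsevent actually uses:

```ruby
# Minimal tagged-netstring encoder: "<byte length>:<payload><type tag>".
# Hypothetical illustration only; not rb-fsevent's actual serialization.
def tnet_dump(value)
  payload, tag =
    case value
    when String      then [value, ","]
    when Integer     then [value.to_s, "#"]
    when true, false then [value.to_s, "!"]
    when nil         then ["", "~"]
    when Array       then [value.map { |v| tnet_dump(v) }.join, "]"]
    when Hash        then [value.map { |k, v| tnet_dump(k.to_s) + tnet_dump(v) }.join, "}"]
    else raise ArgumentError, "unsupported type: #{value.class}"
    end
  # Lengths are counted in bytes, not characters.
  "#{payload.bytesize}:#{payload}#{tag}"
end

puts tnet_dump("hello")                           # prints 5:hello,
puts tnet_dump({ "path" => "/tmp/", "id" => 42 }) # hypothetical event fields
```

Because every value carries its own length prefix, the reader never has to escape or scan for delimiters, which is what makes the format easy to emit from C.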

@bquorning

Before I got around to debugging per @ttilley’s request above, I found the following: Using the irbtools gem (version 1.0.4 here, haven’t tested with others) on ruby-1.8.7-p174 (ruby 1.8.7 (2009-06-12 patchlevel 174) [universal-darwin10.0]) causes rb-fsevent (calling fsevent.run per the README) to return instantly. Using ruby-1.8.7-p334 [ x86_64 ] or ruby-1.9.2-p180 [ x86_64 ] works just fine with irbtools.

Not requiring irbtools makes rb-fsevent work just fine on ruby-1.8.7-p174 here.

So, that looks like a bug right there. (Though maybe not on this gem…)

@ttilley
Collaborator

twitch

@bquorning

My problems aren’t over yet: Now rb-fsevent does not work on my work computer anymore. Not in 1.8.7 (any patchlevel) or 1.9.2. With or without irbtools, no difference.

  • I have a /.fseventsd folder. (root:admin mode 700)
  • I have no /.fseventsd/no_log file.
  • I do have the /.fseventsd/fseventsd-uuid file with a proper-looking UUID. (root:admin mode 600)
  • The process /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/CarbonCore.framework/Versions/A/Support/fseventsd is running
  • There is a device /dev/fsevents (root:wheel mode 644)
  • grep fseventsd /private/var/log/system.log reveals nothing.

This is on OSX 10.6.8, 2.4GHz Intel Core 2 Duo. Let me know if you need more information.

@bquorning

I just tried running the README sample code from an unsaved TextMate file. Here, Dir.pwd is "/private/var/folders/ds/[something very long]/-Tmp-/" – and rb-fsevent works!

Now, why doesn’t it work when watching my home dir? And how can I help you debug this issue?

@ttilley
Collaborator

...wtf? so it works under /private but not outside of /private? can you try other folders that are under /private/?

@bquorning

Today’s round of debugging leaves me equally baffled…

  1. Tried watching /private (works), /Users/benjamin (works!) and /Users/benjamin/code (doesn’t work).
  2. I tried making a new subfolder /Users/benjamin/kode and watching it as first “kode”, then “Kode.” Both work.
  3. Suspecting that my code folder’s name (“code”) was the problem, I renamed the folder and watched it. Yay, now it works.
  4. Renamed back to “code” to confirm that the problem was indeed the folder’s name. Argh (but yay, I guess), it still works.

Concluding: I don’t know what the problem was, but changing the folder’s name to something else, and back, seems to have solved the problem.

@ttilley
Collaborator

That actually says quite a bit. When you rename a folder, that doesn't change the inode that data is kept in... just metadata. So, technically, there wasn't a new 'thing' to watch. It's not the folder itself that's a problem, OR that it's under the specially-handled OSX 'private' folder where exceptions are made in the realpath logic.

[09:13:08][~]$ stat -x code
  File: "code"
  Size: 68           FileType: Directory
  Mode: (0751/drwxr-x--x)         Uid: (  501/ ttilley)  Gid: (   20/   staff)
Device: 14,2   Inode: 5577795    Links: 2
Access: Fri Jul 15 09:12:43 2011
Modify: Fri Jul 15 09:12:43 2011
Change: Fri Jul 15 09:12:43 2011
[09:13:17][~]$ mv code kode
[09:13:24][~]$ stat -x kode
  File: "kode"
  Size: 68           FileType: Directory
  Mode: (0751/drwxr-x--x)         Uid: (  501/ ttilley)  Gid: (   20/   staff)
Device: 14,2   Inode: 5577795    Links: 2
Access: Fri Jul 15 09:12:43 2011
Modify: Fri Jul 15 09:12:43 2011
Change: Fri Jul 15 09:12:43 2011
[09:13:29][~]$ mv kode code
[09:13:40][~]$ stat -x code
  File: "code"
  Size: 68           FileType: Directory
  Mode: (0751/drwxr-x--x)         Uid: (  501/ ttilley)  Gid: (   20/   staff)
Device: 14,2   Inode: 5577795    Links: 2
Access: Fri Jul 15 09:12:43 2011
Modify: Fri Jul 15 09:12:43 2011
Change: Fri Jul 15 09:12:43 2011

Note that the inode (where it's kept on disk, the thing being watched) hasn't changed. It remains 5577795. Also, the access/modify/change times have stayed the same throughout. ALL that has changed is a small chunk of metadata in the inode referring to the directory.

I mean, it's not a solution, but it IS a significant amount of information.
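
The same observation can be reproduced from Ruby. This is a small sketch that works in a throwaway temporary directory; `inode_across_rename` is a hypothetical helper name:

```ruby
require "tmpdir"

# Rename a directory and return its inode number before and after the rename.
def inode_across_rename(from, to)
  before = File.stat(from).ino
  File.rename(from, to)
  [before, File.stat(to).ino]
end

Dir.mktmpdir do |base|
  code = File.join(base, "code")
  kode = File.join(base, "kode")
  Dir.mkdir(code)
  before, after = inode_across_rename(code, kode)
  puts "inode before: #{before}, after: #{after}, unchanged: #{before == after}"
end
```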

@ttilley
Collaborator

It's good to know we're not the only ones seeing fsevents do crazy things of a similar nature -> http://lists.apple.com/archives/filesystem-dev/2011/Jan/msg00000.html

Note that his testing scenario resulted in the file creation event in one scenario, but not another (where the file was deeper in the tree before the multiple-level-ancestor directory was renamed)... and at least for time machine, both cases resulted in an inaccurate backup.

Fun times.

@volkanunsal

I'm having this exact same problem. The events are not firing and as a result Guard, which depends on rb-fsevent, does not work on my system. Hopefully a solution can be found for it.

@muddymatches

Upgrading to Lion fixed this for me.

@ttilley
Collaborator

@tenaciousflea can you perform the checks I mentioned where I re-opened this bug?

I wish I could ssh into one of your machines or something to poke around... in a shared screen session perhaps. As of yet, I have been unable to reliably reproduce this and so I can't dig into what might be causing it in depth. >_<

@ttilley
Collaborator

Actually... I live on east coast USA. If any of you still having this problem are ok with letting me shell in and poke around, I'm often on irc.freenode.net as either Aphelion or ttilley (though not as often non-idle). Alternatively, we could set up screen sharing over ichat, which would be nice since we could talk at the same time.

I really really really want to stomp this bug flat and be done with it.

@volkanunsal

@ttilley I can offer to help you using my computer. I looked up your post from when you reopened the issue, but I'm not proficient enough with shell programming to do the tests. If you give me brief snippets of code to run, I can run them and tell you what they produce.

Is there a chatroom you go to often? I'm on the East Coast as well, and I can be available over the weekend.

@ttilley
Collaborator

I tend to idle in: #ruby-lang #ruby #macruby #rubinius #jruby

@thibaudgg
Owner

You can also use #guard (irc.freenode.net).

@ttilley
Collaborator

...I just realized it's 7am and I haven't gone to sleep yet. So I hope you're not a morning person.

I have, however, converted fsevent_watch into its own xcode project, replaced the extconf.rb fakeout build system with a rakefile that calls xcodebuild, performed some cleanup, did extensive testing to ensure that this works as expected in 10.6+10.7 using xcode 3.2.6, 4.0.2, 4.1, and 4.2b5 (4.1+ for lion), refactored some C, added a hidden command-line argument to enable file-level events in lion for giggles, and introduced a system of pluggable output formats with the eventual goal of exposing extensive metadata on each event fired.

Graph here: https://github.com/thibaudgg/rb-fsevent/network

Not bad for a sitting. I think.

@volkanunsal

Awesome. I'm trying to reproduce the problem at my home computer. Just the day before I was seeing the issue both on my home and my work computer. Then I upgraded to Lion at both places -- which meant upgrading XCode too. Yesterday when I left office, I was still seeing the problem on my work computer, but at my home computer it has...evolved. At least I think it's no longer something to do with rb-fsevent. The problem has moved up to the guard-coffeescript. I am able to see the events are detected on my computer, but guard-coffeescript is not compiling javascript files. Well, at least that's something.

I'm going to have to report back on how the office computer is functioning on Monday. I already posted on guard-coffeescript issue log to let them know about this.

@ttilley
Collaborator

crap. there goes another chance to figure out, exactly, what the problem is and write a check for it. just like everyone else who reports this bug you did something and then it magically went away.

@volkanunsal

Seems updating to Lion solved it for me as well. It may have been the XCode update, not Lion by itself, though.

@ttilley ttilley closed this
@ttilley
Collaborator

...I'm going to just close this issue. I can't reproduce it and everyone who reports it also reports it going away after doing X where X varies. I'm strongly debating a pre-compiled binary approach and am doing some heavy refactoring of fsevent_watch.

@andreyvit

Guys, I just wanted to chime in to say that LiveReload 2 users are experiencing this exact problem too, most with folders inside Dropbox. I'm 100% sure it's an OS bug. I generally recommend the create-new-folder-and-move-everything approach, and it's yet to fail for anyone (others reported that rebooting solved the issue).

So when you detect a failing folder /foo/bar/boz, you do

cd /foo/bar
mv boz boz.0
mkdir boz
mv boz.0/* boz/
rmdir boz.0

Ditto for /foo/bar and /foo, if needed. Problem will be solved.
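
For scripting the workaround, here is a Ruby sketch of the same steps. Unlike the shell glob it also moves dotfiles, so the final rmdir does not fail on hidden entries. `recreate_directory` is a hypothetical helper; it copies only the permission bits to the new directory, so ownership, ACLs and extended attributes of the folder itself are not carried over:

```ruby
require "fileutils"

# Hypothetical helper: replace dir with a freshly created directory of the
# same name, then move every entry (dotfiles included) into it.
def recreate_directory(dir)
  dir = File.expand_path(dir)            # also drops any trailing slash
  backup = "#{dir}.0"
  raise ArgumentError, "#{backup} already exists" if File.exist?(backup)
  mode = File.stat(dir).mode & 0o7777    # keep the original permission bits
  File.rename(dir, backup)
  Dir.mkdir(dir)
  File.chmod(mode, dir)                  # mkdir is subject to umask, so set the mode explicitly
  Dir.children(backup).each { |name| FileUtils.mv(File.join(backup, name), dir) }
  Dir.rmdir(backup)
  dir
end

# Usage (path taken from the example above):
#   recreate_directory("/foo/bar/boz")
```

The new directory gets a new inode, which is the point of the exercise: whatever stale state was attached to the old one is left behind.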

I have no idea what the cause is, but I have plans to include automatic detection of this issue into LiveReload 2.

@volkanunsal

@andreyvit

In my case, I figured out the folder name was the cause of the problem. The folders had . and ( characters in their names. Moving to folders without those characters solved my problem. The example you give seems to validate my conclusion.

@andreyvit

@tenaciousflea Um… which example? /foo/bar/boz was the original name in the example, and it's not even a real name. (This happened to me with perfectly regular folders. In fact, it used to happen on ~/Dropbox/Projects/livereload itself. :)

@volkanunsal

@andreyvit

In that case, the mystery continues.

@ttilley ttilley reopened this
@ttilley
Collaborator

This bug makes me want to stab myself in the face.

If I can poke at a machine/setup where this problem actually happens that'd be great. Every person who has offered has ended up doing something before I get there (install something, rename folders, upgrade xcode, upgrade macos) and the problem goes away before I can take a look.

@gvt

I am experiencing this on OSX 10.7.1 with rb-fsevent version 0.4.3.1. It's a real pain.

I am finding that autotest-fsevent does work (0.2.5).

@ttilley
Collaborator
@volkanunsal

@ttilley

This started happening on my computer again after a routine firmware update. I haven't tested it thoroughly to understand its symptoms just yet, but when I moved the folder to the desktop, the issue went away. If you want to sync up today (or tomorrow) to look at my system, that would work for me.

@ttilley
Collaborator
@ttilley
Collaborator

it was massively helpful to poke around on a machine where this was consistently breaking, and i'm now sure i have a clearer picture of the bug. @andreyvit - ping in case you're interested (or have more info than I do here).

In this particular instance of the bug, the folder 'Xperiments' was returning 'xperiments' via realpath as well as apple's path resolving apis. After renaming the folder to 'xperiments' and then back to 'Xperiments', those apis returned the expected with-caps name. Unfortunately I accidentally fixed the bug on the machine before getting the kind of data I wanted, and I have no idea how to directly parse the fseventsd data logs, but I'm guessing the events were being produced using the with-caps version of the name and thus watching for the without-caps version will miss them entirely.

In short, the bug is that realpath() doesn't give a correct result all the time for case-insensitive-but-preserving HFS+ volumes.
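
To make the guess concrete: if event paths are matched byte-for-byte against the watched path, a watch registered under the wrong case never sees anything. The sketch below only illustrates that hypothesis. Nobody in this thread has confirmed how fseventsd actually matches paths, and `events_under` and the example paths are hypothetical:

```ruby
# Hypothetical filter: keep only event paths that sit under the watched
# directory, compared byte-for-byte (so case differences do not match).
def events_under(watched_dir, event_paths)
  prefix = watched_dir.end_with?("/") ? watched_dir : "#{watched_dir}/"
  event_paths.select { |path| path.start_with?(prefix) }
end

reported = ["/Users/someone/Xperiments/notes/"]

# Watching the lower-case name that realpath() handed back:
p events_under("/Users/someone/xperiments", reported)  # => []

# Watching the name as it is stored on disk:
p events_under("/Users/someone/Xperiments", reported)  # => ["/Users/someone/Xperiments/notes/"]
```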

@ttilley
Collaborator

i'd also like to note that stat -x was also returning unmodified whatever path it was given. if called on the with-caps path, that's what it gave as a name. same for without-caps.

@ttilley
Collaborator

...we are of course back to the scenario where I don't have access to a machine where the bug is occurring, so I can only blindly try things and hope it works. if realpath() isn't returning the correct result, readdir() might not either (though i'm hoping this is unlikely, since ls shows the correct case).

anyways, the idea is to use realpath() or similar and then manually check each directory node for the correct case for that child. slightly painful, but at least it only needs to happen once... when registering a path to watch.
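
A rough Ruby sketch of that per-component check, assuming that the parent's directory listing (readdir() underneath `Dir.entries`) does report the stored case, which is exactly the part that is still untested. `true_case_path` is a hypothetical helper, not something rb-fsevent provides; it does not resolve symlinks, and `casecmp?` uses Unicode case folding but does not account for HFS+ name normalization:

```ruby
# Hypothetical helper: rebuild an absolute path one component at a time,
# taking each name's case from the parent directory's listing.
def true_case_path(path)
  parts = File.expand_path(path).split("/").reject(&:empty?)
  parts.inject("/") do |resolved, name|
    entries = Dir.entries(resolved)
    # Prefer an exact match; otherwise fall back to a case-insensitive one.
    match = entries.include?(name) ? name : entries.find { |e| e.casecmp?(name) }
    raise Errno::ENOENT, File.join(resolved, name) if match.nil?
    File.join(resolved, match)
  end
end

# Usage (hypothetical path):
#   true_case_path("/users/someone/xperiments")
```

This costs one directory listing per path component, but as noted it only has to happen once per watched path.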

@andreyvit

@ttilley Great findings. Does actually explain why recreating folders fixes this problem. And btw I have info that 10.7 users also still experience the problem, so it's not fixed in OS X. Will try to play with this new info too.

@ttilley
Collaborator

the person who let me poke around was running 10.7.1 actually.

It seems that FSCopyAliasInfo() reliably returns sane results, but again... I no longer have access to a machine where the bug still happens. On the plus side, FSAliasInfo also contains a signature of the containing volume's filesystem type... so checking for this bug as well as the "high latency + macfuse" bug can be done in one go (sshfs can fire when you start writing, but not again when you finish, if the write operation takes more than @latency time).

@ttilley
Collaborator

more specifically, calling FSCopyAliasInfo with /users/ttilley/desktop gets us (yes, i named my lion install volume "rawr"):

targetName                 = 'Desktop'
volumeName                 = 'rawr'
pathString                 = '/Users/ttilley/Desktop'
volumeCreateDate           = 0.3394049572.0 (Wednesday, July 20, 2011 7:32:52 PM Eastern Daylight Time)
targetCreateDate           = 0.3394050005.0 (Wednesday, July 20, 2011 7:40:05 PM Eastern Daylight Time)
isDirectory                = YES
parentDirID                = 281309
nodeID                     = 281467
filesystemID               = 0x0000 ('??')
signature                  = 0x482b ('H+')
volumeIsBootVolume         = YES
volumeIsAutomounted        = NO
volumeIsEjectable          = NO
volumeHasPersistentFileIDs = YES
@andreyvit

Meanwhile, made 2 tools to investigate this:

@ttilley Can you pls look at https://github.com/andreyvit/find-fsevents-bugs/blob/master/find-fsevents-bugs.c, does it look like it would detect the bug you're talking about? I've scanned my Dropbox folder, and so far zero results. Running on the whole disk now.

@andreyvit

Actually here's a snippet:

realpath(path_buf, real_path_buf);
if (0 != strcmp(path_buf, real_path_buf)) {
    output("Found: '%s' != '%s'\n", path_buf, real_path_buf);
    ++results;
}

Any better ideas?

@ttilley
Collaborator

bit too literal:

[17:42:02][find-fsevents-bugs] (master u=)$ ./find-fsevents-bugs /Users/ttilley/Desktop/
Found: '/Users/ttilley/Desktop/' != '/Users/ttilley/Desktop'
Done, 1 result(s).

other than that, I hope so... though checking my ~/.rvm dir takes 30 seconds. :/
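One way to avoid that false positive is to drop trailing slashes before comparing. A minimal Ruby sketch of the comparison, with hypothetical helper names (the real tool is written in C and may handle this differently):

```ruby
# Hypothetical helpers: ignore trailing slashes, but keep the comparison
# case-sensitive so that a genuine case mismatch is still reported.
def strip_trailing_slashes(path)
  stripped = path.sub(%r{/+\z}, "")
  stripped.empty? ? "/" : stripped
end

def same_path?(given, resolved)
  strip_trailing_slashes(given) == strip_trailing_slashes(resolved)
end

puts same_path?("/Users/ttilley/Desktop/", "/Users/ttilley/Desktop") # true
puts same_path?("/users/ttilley/Desktop", "/Users/ttilley/Desktop")  # false
```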

I'd like to stress that I haven't a clue whether or not readdir() will give the expected entry if realpath() gets it wrong and I don't have a place to test the theory. If your beta testers could be a resource here that would be glorious.

...I also have never experienced this bug myself on my own machine, so I can't even imagine the conditions under which it is triggered. Dropbox isn't responsible, as not everyone who reported the issue has used it. It does seem to exacerbate the problem though, or at least make it more likely for these conditions to occur.

HEY. Here's an idea: does syncing a mixed-case directory/file between machines result in a broken/inconsistent inode? Make directory 'MixedCase' on machine1, test for issues on machine2? It'd be nice if that ended up being a consistent method of reproducing this bug, but I know not to get my hopes up.

@ttilley
Collaborator

You might also enjoy using my updated fsevent_watch binary with debugging enabled, as it's fairly descriptive: https://github.com/ttilley/fsevent_watch/blob/master/fsevent_watch/main.c#L75

...or my MacRuby port, which is significantly easier to read, and feature-filled as fuck (TM): https://github.com/ttilley/mrb-fsevent/blob/master/lib/mrb-fsevent/flags.rb

@andreyvit

Can you btw give a snippet that calls FSCopyAliasInfo? I'm lost in FSRefs and AliasHandles, and don't really have a burning desire to read all docs. :)

Here's an idea: does syncing a mixed-case directory/file between machines result in a broken/inconsistent inode?

I can't see why it would be.

I've tried saturating FSEvents with looped renames, but that goes nowhere.

@ttilley
Collaborator
FSRef itemRef;
AliasHandle itemAlias;
HFSUniStr255 targetName;
HFSUniStr255 volumeName;
CFStringRef pathString;
FSAliasInfoBitmap returnedInInfo;
FSAliasInfo info;

// get an FSRef here -_-

FSNewAlias(NULL, &itemRef, &itemAlias);
FSCopyAliasInfo(itemAlias, &targetName, &volumeName, &pathString, &returnedInInfo, &info);

I'm working off of the "FSMegaInfo" sample code since it can be a lot less intimidating reading sample code than massive list-every-possible-combination reference docs. ;)

@ttilley
Collaborator

oh yeah, anything you don't care about having returned can be NULL instead of a pointer and it will just skip that part. so in theory you can just get the pathString. i think.

AND it requires CoreServices.

...AND you can only get an FSRef for a file that exists, so you can't use it for watching something that has yet to be created. The pattern for that is using an FSRef for the parent directory and a unicode name string, according to the docs. Not very useful if multiple sections of your to-watch path don't exist yet, which should be possible. :/

...simplest path-to-fsref seems to be to create a CFURL for the path and call CFURLGetFSRef

@ttilley
Collaborator

really, I don't think we're going to make any progress until after one of your beta testers runs your find-fsevents-bugs app or someone else chimes in here saying they're experiencing the bug. we have no way to test anything.

@ttilley
Collaborator

as an aside, my AIM handle is YourTravis and i'm often on irc.freenode.net as either ttilley or Aphelion. if you want to bounce ideas, or if anyone on this bug is still experiencing it (anyone), i'm around and would like nothing better than destroying this bug and never hearing another word about it again.

@Latency
@andreyvit

I don't have anyone who's still having the issue either. The next person to report it will be treated with great care :)

@ttilley
Collaborator

@latency - you can remove yourself by clicking "disable notifications for this issue". I accidentally added you by using the ruby instance variable syntax for @latency outside of a code block. I do apologize... and thought that correcting it in an edit might un-do the notification, but this apparently wasn't the case.

The URL is: #10

I unfortunately can accidentally subscribe you, but not intentionally unsubscribe you.

@ttilley
Collaborator

@andreyvit - HFS+ volumes store file names in a funky variant of unicode that attempts to normalize character representations via a different-across-os-releases decomposition algorithm. A subtlety of this algorithm is that besides case-insensitivity, there are also unicode characters that are ignored completely when comparing file names. Case sensitive HFSX volumes, however, do NOT ignore said unicode characters (in addition to being case sensitive). On HFS+, "forwards" and "sdrawrof" (written as printed, not stored) are equivalent filenames.

God I'm glad I'm not a filesystem developer.

In finder, I have a filename that reads !באמת. The filename is 9 bytes. When I ls in iterm, it prints as באמת!. When I ls in Terminal.app, I get the form used in finder. Additionally, when I start IRB in iterm, Dir[*] displays filenames in a completely different order than if I do the same in Terminal:

# iterm
> Dir['*']
 => ["באמת!", "☃"] 

# Terminal
> Dir['*']
 ["☃" ,"!באמת"] <=

However, they both print the same result when realpath is used (with the exclamation point at the end). This is another solid example of multiple APIs returning different results. Perhaps this oddity could be used to simulate the conditions that trigger this bug, even if we don't know the exact conditions that would do so otherwise... and perhaps that's just wishful thinking on my part.

For reference, the actual bytes for that filename are: "\xd7\x91\xd7\x90\xd7\x9e\xd7\xaa\x21"
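To illustrate why the byte representation matters: HFS+ stores names in a decomposed form close to NFD, so the same visible name can have two different byte sequences. A small Ruby (2.2+) sketch, using a made-up name rather than the file above; `hex_bytes` is a hypothetical helper:

```ruby
# Hypothetical helper: print a name as escaped hex bytes.
def hex_bytes(name)
  name.bytes.map { |b| format('\x%02x', b) }.join
end

composed   = "caf\u00e9"                        # precomposed e-acute (NFC)
decomposed = composed.unicode_normalize(:nfd)   # "e" + combining acute accent

puts hex_bytes(composed)     # \x63\x61\x66\xc3\xa9
puts hex_bytes(decomposed)   # \x63\x61\x66\x65\xcc\x81
puts composed == decomposed  # false: a byte-wise comparison sees two names
puts composed.unicode_normalize(:nfc) == decomposed.unicode_normalize(:nfc) # true
```

Any tool that compares paths from two different APIs byte for byte would need to normalize both sides first, or it could report a mismatch that has nothing to do with case.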

@ttilley
Collaborator

I REALLY wish I got the byte representation of the directory causing this bug when I had the chance, but I didn't think of it. It's certainly possible that renaming it changed this representation.

@ttilley ttilley was assigned
@ttilley
Collaborator

fuck it. I burned a dev support ticket and linked to this issue. Hopefully support can enlighten us (though it might take them a while to do so... I didn't purchase above-basic support).

@ttilley
Collaborator

Every simple method of path resolution I know ignores case for the file path part. I have a file named LikeYoMothaFuckaWeee.txt, and if I call realpath() on it in lowercase, the last path component is in lowercase. NSString's fileSystemRepresentation does no better. Neither does stringByStandardizingPath or stringByResolvingSymlinksInPath. Similarly for the methods on NSURL.

BUT... converting a file path NSURL to a file reference path and back results in the correct case for all components every time. Unless someone has a better answer, I think that's what I'm going to use.

[23:20:24][MiXeDcAsE]$ irb
irb(main):001:0> t = Dir['*'][0]
=> "LikeYoMothaFuckaWeee.txt"
irb(main):002:0> t.downcase!
=> "likeyomothafuckaweee.txt"
irb(main):003:0> t = File.realpath(t)
=> "/Users/ttilley/Dropbox/testing/MiXeDcAsE/likeyomothafuckaweee.txt"
irb(main):004:0> u = NSURL.fileURLWithPath(t)
=> #<NSURL:0x400220c20>
irb(main):005:0> u.description
=> "file://localhost/Users/ttilley/Dropbox/testing/MiXeDcAsE/likeyomothafuckaweee.txt"
irb(main):006:0> u.fileReferenceURL.description
=> "file:///.file/id=6571367.13518787"
irb(main):007:0> u.fileReferenceURL.filePathURL.description
=> "file:///Users/ttilley/Dropbox/testing/MiXeDcAsE/LikeYoMothaFuckaWeee.txt"
@andreyvit

@ttilley To clarify, LiveReload does not call realpath (or anything similar) before starting monitoring. So it was using the same name that Choose Directory dialog returned.

Also, regarding Unicode chars, again I had this problem on latin-only names like ~/Dropbox/Projects/Active/livereload (as far as I remember, the whole ~/Dropbox/Projects was broken back then). And it happened on just one of my computers — I was using two back then, and on the other one all folders were perfectly monitorable (so the problem is not getting replicated by Dropbox).

@agibralter

I'm having the issue too -- I'm using guard with rspec and rb-fsevent in an app directory located on dropbox. The problem of not detecting changes used to happen on my old MacBookPro running the latest version of 10.6 and rvm+REE; however, it never seemed to happen on my iMac (in the same synched dropbox directory). Now, the problem has appeared all of a sudden on my new MacBook Air running the latest version of Lion (in that same dropbox directory) running rvm+ruby1.9.2 and rvm+ruby1.9.3-rc1. I tried running Disk Utility's verify/fix permissions on my Macintosh HD (as I've seen in other posts about this problem) to no avail. I also tried moving the directory (within dropbox) and it didn't seem to help either.

@andreyvit

@agibralter Your broken directory is a very valuable asset. If you don't mind us looking at your system, please don't try to fix it!

Feel free to say hello either on IRC mentioned above, or on LiveReload Support web chat (Campfire): https://andreytarantsov.campfirenow.com/f2839.

@ttilley
Collaborator

I won't be around much of the day... please contact andrey if at all possible. He has as much enthusiasm for seeing this bug die, and might even be more capable at debugging it. ;)

I'd certainly love to see you run his find-fsevents-bugs app to see what it returns.

@ttilley
Collaborator

@agibralter - I also want to stress that your broken directory is very helpful for us. We can't reliably reproduce this bug and must depend on users who report it in order to test theories on fixing it.

@ttilley
Collaborator

@agibralter - after running andrey's helper tools, i'd love for you to install the pre-release rb-fsevent gem (0.9.0.pre1) to see if the problem persists with that version. I switched to using the AliasInfo method of resolving paths.

@andreyvit

@ttilley If you are around, please join us in Campfire, and I'll hook you up with remote desktop session too.

@andreyvit

@agibralter Thanks a lot for letting us in. We found that:

  1. 'Broken' folder is ~/Dropbox/Foo/Bar, latin characters only, nothing fancy.
  2. Readdir and realpath return ~/Dropbox/Foo/Bar, while FSCopyAliasInfo returns ~/Dropbox/Foo/bar (note the lowercase bar).
  3. We've tried fseventsmon on ~/Dropbox/Foo/Bar, ~/Dropbox/Foo/bar, ~/Dropbox/Foo/Bar/subfolder and ~/Dropbox/Foo/bar/subfolder. No changes detected by either of those.

I think the bottom line is:

  • we have a good way to detect this case
  • we have no way to continue monitoring until the problem is fixed
  • we seem to have a way to fix the problem (just rename the folder and then rename it back).

We did not actually try fixing it, and I've asked agibralter to leave it as is for now (and make a copy for further work), so we can try something else if there are any ideas.

I believe the best way forward is to auto-detect and maybe to offer auto-fixing.

@ttilley
Collaborator

Sorry I had to duck out in the middle like that.

...I'm also insanely curious to hear what dev support might say when they respond.

Did you get a byte representation of the file name? If not, in ruby 1.9.x:

# where num_in_list is the array index for that file/directory
Dir['*'][num_in_list].bytes.map {|b| '\x%02x' % b}.join('')
@andreyvit

@ttilley: No, I did not get byte representation, I'm quite sure it is just a regular English name. And yes, dev support may be a good way forward on this. I think I have 2 support incidents with my Mac dev program, will try to use one. (But everything after this point is probably pure curiosity. I think we have a good plan of detection and recovery.)

I have an idea to set up a VM with OS X, install Dropbox and see if we can get some broken folders after initial sync.

@ttilley
Collaborator

I installed dropbox on my 32bit mac running snow leopard just in case there's an oddity in syncing back and forth and I noticed that directory renames don't sync if the only change is in case. renaming 'test' to 'TEST' doesn't sync. you need to rename a file for it to think it's sync-worthy.

@ttilley
Collaborator

@andreyvit - True. I think I'll add code that resolves via realpath, copyalias, and for 10.6+ resolving file reference urls (where it just has volume id and inode number and determines the path from that). If they're not all the same... print super helpful error message and quit. for my use case, it doesn't seem like auto-fixing this would be helpful.

Perhaps we should link everyone who has the bug here to add their comments: http://openradar.appspot.com/10207999
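A rough Ruby sketch of the "resolve several ways and bail out if they disagree" idea. The resolvers are passed in as lambdas because FSCopyAliasInfo and file reference URLs are not reachable from plain Ruby; `check_resolvers` and the resolver names are hypothetical, not the code that ended up in fsevent_watch.

```ruby
# Hypothetical check: run every resolver on the same path and refuse to
# continue unless they all agree, byte for byte.
def check_resolvers(path, resolvers)
  results = resolvers.transform_values { |resolver| resolver.call(path) }
  return results.values.first if results.values.uniq.size <= 1

  details = results.map { |name, resolved| "  #{name}: #{resolved}" }.join("\n")
  raise ArgumentError, "path resolvers disagree for #{path}:\n#{details}"
end

# Example wiring with only the resolver that plain Ruby offers:
resolvers = { "realpath" => ->(p) { File.realpath(p) } }
puts check_resolvers(Dir.pwd, resolvers)
```

With the values from agibralter's machine, a realpath resolver returning ~/Dropbox/Foo/Bar and an alias resolver returning ~/Dropbox/Foo/bar would raise and list both results.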

@agibralter

Ok so on my iMac rb-fsevent (0.4.3.1) works just fine at detecting changes in ~/Dropbox/Foo/Bar. When I ran find-fsevents-bugs on the iMac I get this:

15:18 [~/Dropbox/Foo/Bar]  
$ ./find-fsevents-bugs /Users/aarongibralter/Dropbox/Foo/Bar/
Found (realpath): '/Users/aarongibralter/Dropbox/Foo/Bar/' != '/Users/aarongibralter/Dropbox/Foo/Bar'
Found (FSCopyAliasInfo): '/Users/aarongibralter/Dropbox/Foo/Bar/' != '/Users/aarongibralter/Dropbox/Foo/Bar'
Done, 2 result(s).

Whereas when I run it on my macbook air (where rb-fsevents doesn't detect changes) I get thousands of results (Done, 3332 result(s). to be precise).

My iMac is running 10.6.8 and my MBA is running 10.7.1.

@agibralter

Also, on both computers, Finder and ls show that Foo is capitalized. It may be that when Dropbox creates folders and files on a computer, it does it strangely... the files were all on my iMac to start.

@agibralter

And one more thing -- it used to work on my macbook air. It just stopped working a couple days ago and I'm really not sure why. I can't remember what I could have done to change anything.

(sorry for the spamming!)

@ttilley
Collaborator

Response from apple developer support:

Travis

I'm responding to your report of a problem with FSEvents not firing properly.  You wrote:

> While the bug is occuring, realpath()/readdir() and FSCopyAliasInfo()
> return paths with different case.

Yeah, that's most suspicious.  I suspect that this problem has nothing to do with FSEvents per se, but rather that there's a case
insensitivity edge case deep within the kernel (either in HFS Plus or in the VFS name lookup cache).

> I have an openradar for this: http://openradar.appspot.com/10207999

Thanks for that.  Given that you only filed this yesterday, I'm going to give it a couple of days to wend its way through
the system, after which I'll see if kernel engineering has any suggestions as to why this might be happening.  I'll
send you another update soon (probably in the second half of next week).
@agibralter

Interesting. Is there any way to handle the edge case of incorrect capitalization in FSCopyAliasInfo in the mean time?

@ttilley
Collaborator

oh yeah, there's a way to detect the scenario in which the problem is likely to occur. one could even auto-rename folders without asking if you're ok with that level of intrusion (might make sense for an application like livereload, but not a generic library like rb-fsevent). there is not, however, a way to simply fix the issue wholesale.

@agibralter

Well I think renaming is a bit too intrusive... but couldn't rb-fsevent just watch out for that situation, print a warning, and then monitor the corrected directory?

@ttilley
Collaborator

@andreyvit - the final response from apple is that this isn't a bug because the fsevents table is case sensitive. i think the point of the bug report, being that path resolution apis weren't necessarily returning the correct case for a path, was completely ignored (so client and fseventsd can still have different paths). re-submitting...

in the meantime, i added file reference url resolution as the default 10.6+ strategy. no idea if that'll be more consistent. sigh.
ttilley/fsevent_watch@e249e0f#L1R49

@ttilley
Collaborator

The equivalence of what is happening is:
1. fsevents_tool -files ./test
2. mv ./test ./bar
3. echo "" > bar/test

Nobody should or would expect events to arrive on test/test.

The user workaround is as follows:

  1. Do not rename watched paths. (mv /Users/foo/bar /Users/foo/BAR)
  2. Watch the parent directory instead of the directory you wish to rename.

/me slams head into desk repeatedly

@andreyvit
@agibralter

Would you still like to keep my directory in science experiment mode? Or can I move things around? :)

@andreyvit

Well, personally, I would be super-happy if you could download LiveReload 2 app (http://livereload.com/), add your directory there and see if it triggers the error. If you choose to do so, you can report the result to support@livereload.com to avoid spamming people here.

Other than that, there hasn't been any development on this topic, so I guess no — feel free to delete/move it. Thanks!

@ttilley
Collaborator

@agibralter - my bug report has stalled. 100%. after clarifying the issue, any questions about the status of the bug go ignored. the utility they mention in their bug reproduction example (for the incorrect bug) doesn't actually exist, at least publicly. sending the raw contents of /.fseventsd/ would be a potential risk for you, and painful for us, as it would include binary representations of all events, coalesced to a degree. I don't know of an easy way to look at the raw HFS+ inode data alongside fork data to determine potential on-disk issues.

The only thing I can think of is something that uses the underlying internal-only fsevent device API that fseventsd itself uses, and censoring that yourself to only include data you want to share. Something like: http://www.osxbook.com/software/fslogger/

As far as I can guess, there should STILL be an fsevent notification somewhere... it's just incorrect in regards to case. If you can include ls output of the directory, stat -x of the directory, and a snippet of output from something like fslogger while editing files under the broken directory, that might be enough info to push forward the bug report with apple.

@ttilley
Collaborator

...not that it'd particularly help us much, but i'd really like to see this fixed in macos.

@ttilley
Collaborator

First bug report was March 30, 2011. Last bug report was October 25, 2011.

The majority of reporters had a problem with a path underneath a dropbox sync point. Dropbox doesn't have a UI for performing updates. Growl 1.3 was released on the mac app store around November 9, breaking dropbox notification support. An updated version of dropbox was available not long after, but that may have been the first time a lot of people had ever updated. It's a bit suspicious to me that bug reports stopped coming in right when a bunch of people had a reason to update dropbox... which doesn't use the public fsevents api, instead opting to use the private api.

Given the details and timeframe I'm happy to consider this bug unresolvable and dead. Unresolvable because it's looking likely that the issue isn't with macos or my own software, and dead because I haven't had any recent reports and I'd really rather stab myself in the face than spend more time on this.

...Since I was desperately waiting to resolve this issue before making another release, expect one in the hopefully not too distant future that also resolves the issue with installing without xcode.

@ttilley ttilley closed this
@andreyvit

Hehe, I've recently had two users encounter this bug (for whom LiveReload has detected it and referred them to me). Both on Dropbox. Of course, I have no idea if they have updated Dropbox recently or not. (Does it even require manual updating? I don't think it ever asked me about updates.)

@michaelstalker

I know this is closed, but I recently ran into this issue in my project folder. I have Dropbox installed, but my project folder is not in Dropbox. I renamed my project folder from "sites" to "Sites" and everything started to work fine. It still works after renaming it back to "sites".

@ttilley
Collaborator

@mstalker - frustrating, isn't it? i originally thought this was completely unrelated to dropbox because the problem wasn't always under the dropbox folder, but it does appear that dropbox (specifically /Library/DropboxHelperTools/Dropbox_VERSION/dbfseventsd) is still involved in triggering the state where this problem occurs. I'd be lying if I said I fully understood the scenario, but upgrading dropbox has improved the situation dramatically. If you don't mind, can you look and see what version is being run? Locally I have u501.

I recently discovered that dropbox for linux can also cause problems on that platform. How it manages to do that, I have no idea. Compass.app has a bug report that FS notifications (via inotify) simply don't work on ubuntu under the dropbox folder.

@codepodu

I just faced this issue (LiveReload warned me) and I don't even have Dropbox installed :|

I don't use it, and have never on my mac (latest Lion)

What do you say about that?

@ttilley
Collaborator
@codepodu

Sure, you can inspect it. Is it okay to move the files and just keep the empty folder? I'll contact you on IRC.

@tablatom

Wow, reading this thread is painful. I do hope ttilley hasn't stabbed himself in the face yet.

Another poor soul having this problem (Lion). Renaming /Users to /users fixed it, yesterday. Today it is broken again.

Anything I can do to help? I've compiled and run find-fsevents-bugs, but not sure how much light that can shed.

Oh! Update: Renamed every folder in my path to all lower and things seem to be working again. Who needs caps anyway?

@andreyvit

@tablatom The case does not matter by itself, it's the act of renaming that (sometimes) fixes the problem, so you probably want to rename everything back (esp /Users)

By the way, here is a list of steps that, so far, has always helped everyone with this problem. (LiveReload can detect the specific offensive folder; feel free to ask me for a free version if you encounter the problem.)

@andreyvit

@ttilley BTW, it's about time we create a Wiki page about this. Forcing people to find this obscure (and closed!) ticket, and then to read one year's worth of comments is unnecessarily cruel. Feel free to copy text from my support page and/or to link to it.

@tablatom
@karl-petter

I just started using Guard but never got it to react to file changes. After several hours of research I found out about this issue and it seemed to be what I was experiencing. Followed the thread for a bit, checking that I had no_log file, no system.log entries etc. Then I saw that people had solved it by renaming folders and scrolled down to the end of it.

So after checking my paths I saw that when starting Guard it reports Guard is now watching at '/users/kalle/coding/rails/community/trunk' but my actual path is '/Users/kalle/coding/rails/community/trunk', i.e. the Users folder was reported with a downcased U.

I started thinking about whether I could have detected this more easily, because while reading through the thread I started trying to rename different folders. I looked back into my shell history and I saw that I had run pwd before I fixed the issue and it reported the correct path! (I think I did this before the fix, not 100% sure.) So to make it easier for anyone stumbling into this again who just wants to make a quick test, I wrote this short ruby script that I think should detect it and report where it sees a difference.

current_dir_ruby = Dir.pwd
current_dir_system = `pwd`.strip # strip to remove the trailing newline
if current_dir_ruby != current_dir_system
  puts "*******************************************************************************"
  puts "Seems you are affected by the odd Mac OS bug described here"
  puts "https://github.com/thibaudgg/rb-fsevent/issues/10"
  ruby_paths = current_dir_ruby.split("/")
  system_paths = current_dir_system.split("/")
  path = ""
  # walk both paths component by component and report each mismatch
  system_paths.zip(ruby_paths).each do |sp, rp|
    path << sp << "/"
    if rp != sp
      puts "Ruby gives wrong folder name for #{path}"
      puts "Renaming that folder will probably fix your problems"
      puts ""
    end
  end
  puts "*******************************************************************************"
end

Hope it can help anyone!

Btw, wouldn't it make sense to open this issue again? Even if it's not an issue with rb-fsevent it is very likely users will stumble into it again.

@thibaudgg
Owner

@karl-petter I think it's more something that could be handled by Listen.
What do you think of that @ttilley @Maher4Ever ?

@karl-petter

@thibaudgg I do not really understand what you mean? As I have understood it this is an OSX bug and has nothing to do either with Listen or rb-fsevent but users of rb-fsevent are of course affected. Doesn't it affect Listen too?

The solution so far seems to be to rename the affected directory but the problem is to find which one. So I just wrote that little script to more easily see that (if I'm correct that pwd reports the correct path while Dir.pwd reports the faulty one). Since I fixed my problem before writing the script I have not been able to test it :(

@andreyvit

@ttilley @thibaudgg Let me ask it this way: why doesn't rb-fsevent have code to detect the problem and point people to a URL with an explanation? (Does Listen have it?)

@andreyvit

Also note that if you run LiveReload 2 and add an affected (sub)folder to it, LR will tell you the exact name of the buggy folder. I think I already said as much in this thread. Beta versions are free to run; there's no downside (like a trial period or whatever), and you can simply trash the app afterwards.

The relevant code is in OldFSTree.m and Project.m, btw, feel free to reuse it under the MIT license.

@ttilley
Collaborator
@karl-petter

I was again affected by this bug and got the opportunity to test my script, and it works. So it's a good alternative if people do not want to install a piece of software to see if they are affected or not. I'll see if I can figure out what is causing this; right now I suspect I get affected each time I restart my machine, which is not very often though. Guard worked fine yesterday but not this morning, and I shut down the laptop when I left work yesterday.

@pvdev

Wow. I searched all over the place for issues with directories magically changing to all caps and found nothing. Then, while looking to see why I should have installed rb-fsevent with guard, I came across this thread. Many hours have been spent trying to figure out why a guard 'watch' regex would suddenly stop working. I finally saw the reason with Dir.pwd in an irb session. I thought it was because I was using SugarSync on OSX, but I see Dropbox users have seen it also. I tried moving the whole project directory to normal disk space, e.g. /tmp, but still had the all caps when I checked with irb. Interestingly, it only breaks guard; running rspec manually continues to work. For me it first happened at the root of my SugarSync branch, but since then I've seen it in project-level directories. I've also noticed that when the problem is occurring I get the same Dir.pwd output with the rvm ruby or the system ruby.

Enough of a rant.
--Perry

@andreyvit

I wish I could be at WWDC right now and point some OS folks to this thread. Maybe next year.

If anyone can reliably reproduce the problem or has ideas on how to reproduce it, please speak up, by the way.

(Funnily, though, I haven't heard about this bug from LiveReload users lately. I used to get a couple of reports per month, so I have an impression that it's getting better, for an unknown reason.)

@molfar

Is there any result of this issue? How can it be fixed? I have this bug now.

@molfar

I have solved the path case problem in my path. First of all, you need to check the difference between ruby's Dir.pwd and bash pwd output in the project's dir. In my case, I had this difference:
Dir.pwd => /User/Myname/Documents/project
pwd => /User/myname/Documents/project
In this case the problem is in the Myname dir's name, not in the project's dir name. Renaming the project dir has no effect. But renaming my user's folder to Myname and back to myname does have an effect.
So, I think the readme should be updated to suggest not only renaming the project's dir, but first comparing the output of these commands and finding the problematic dirname.

@ryanstout ryanstout referenced this issue in strongloop/fsevents
Closed

Events not firing (tried start also) #21

@ryanstout

I'm having what I think is the same issue, except that pwd and Dir.pwd show the exact same thing for me, but the find-fsevents-bugs shows the directories as being a problem. It will show one with a / at the end and one without. Is there a work around if I have an affected directory? I tried using other directories and it seems like as soon as they are used it causes the issue.

@andreyvit

Public ann: @bdkjones (the author of CodeKit) took his shot at getting Apple Developer Support to fix this problem. So far, the progress is mixed, but we did reveal some additional data.

We would again very much like to poke around on a machine with a broken folder. If you are affected by this bug and can afford to leave the folder unfixed for a few days and can let us screenshare into your machine, please ping me!

@bdkjones

After days of research, I have narrowed down the cause of this issue. It IS Apple's fault and it IS something they have to fix; there is no workaround for us.

FSEvents is case-sensitive. It treats /Users/folder and /Users/Folder as two separate paths. Moreover, when you start an FSEvents stream, the path you pass in is "standardized" to the value that realpath() returns. However, this value does not always match what the kernel writes to /dev/fsevents, which is the exact root cause of this bug.

Here is a screenshot of my ongoing conversation with Apple Developer Technical Support:

[Screenshot of the Apple DTS conversation: screen shot 2015-05-12 at 21 11 50]
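The mismatch can be pictured with a deliberately simplified model in Ruby. It assumes events are matched to a stream by a byte-exact prefix test on the path; that is an assumption made for illustration, not how fseventsd is actually implemented, and `delivered?` is a hypothetical name.

```ruby
# Simplified model (an assumption, not fseventsd's real code): an event
# reaches a stream only if its path is byte-for-byte under the watched path.
def delivered?(watched_path, event_path)
  root = watched_path.chomp("/")
  event_path == root || event_path.start_with?(root + "/")
end

watched = "/Folder/Database" # the spelling realpath() returned

puts delivered?(watched, "/Folder/Database/schema.sql") # true
puts delivered?(watched, "/Folder/database/schema.sql") # false: kernel used another case
```

Under this model the events still exist, they are just filed under a spelling that nobody is watching.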

@bdkjones

Apple, quite idiotically, has not yet acknowledged that this is something they should fix. Even if they do acknowledge that, it could be years before a fix makes it into OS X.

What We Need

In the meantime, I propose that we focus on ways to make the kernel report events using the same path that realpath() returns. So, from the example above, the kernel is reporting events for /Folder/database but realpath() has that path as /Folder/Database.

Since FSEvents always uses the realpath() variation, there must be some sequence of actions we can take (programmatically) on Database that will cause the kernel to start using the same capitalization as realpath().

We just need to discover what those actions are. I have asked Apple DTS for ideas. But it SHOULD be possible to "fix" broken folders somehow.

@andreyvit

@bdkjones This is all that's needed:

# Recreate the folder under the same name, then move everything across.
BrokenFolder="/some/path/to/database"
ParentOfBrokenFolder="$(dirname "$BrokenFolder")"
NameOfBrokenFolder="$(basename "$BrokenFolder")"
cd "$ParentOfBrokenFolder" || exit 1
mv "$NameOfBrokenFolder" "$NameOfBrokenFolder.broken"
mkdir "$NameOfBrokenFolder"
# find picks up hidden entries too, without matching "." and ".." the way a .* glob does.
find "$NameOfBrokenFolder.broken" -mindepth 1 -maxdepth 1 -exec mv {} "$NameOfBrokenFolder/" \;
rmdir "$NameOfBrokenFolder.broken"

...properly implemented in native code and adjusted to preserve the permissions and other attributes of the broken folder.

@andreyvit

...My problem is that I'm not very comfortable doing that to the /Users folder, and also, as we may have figured, my detection method has false positives. Btw, would be nice to hear your latest thoughts on those false positives — the public would greatly benefit from a reliable way to detect this case.

@bdkjones

I would definitely have reservations about using the above method on /Users. First, you'd have to be running as root (which immediately rules out App Store apps). Second, if anything went wrong you'd nuke the user's entire Mac. Third, even if this process completed 100% correctly, it's impossible to say what side effects would happen. Nothing on the system expects the /Users folder to temporarily disappear... it just seems like stuff we can't even imagine would flip the hell out.

This is true of other common folders like ~/Documents and ~/Dropbox, etc. Who knows what apps are using files that are located in those folders and what those apps will do if those files momentarily move as the root folder is replaced?

What I'm wondering is if there's not some less brute-force method that will achieve the same result. Modifying the inode data, changing permissions and then changing them back, some stupid-low-level HFS call... anything like that.

@bdkjones

Yea. I've asked Apple DTS for their recommendation on a programmatic way to "restore" the folder so that the kernel writes events to /dev/fsevents using the same casing as realpath(). I'll report what I hear from them. Hopefully the engineers can give us a simple, robust way to do this.

As for the false-positives issue: I have no idea. I just ran your tool and passed in my home directory, using the same casing that you see in the Finder. It found 13,800 broken folders. Interestingly, all of them appear to be ones that are somehow exposed to "networking". The new photos library, a bunch of crap in an iCloud-related folder in ~/Library, etc. Given that most folks who report the FSEvents bug to me are using Dropbox and other network-syncing clients and the folder most often affected is Dropbox, I'm guessing that whatever is corrupting the filesystem info to produce these "broken" folders more often affects those folders that are somehow exposed to networking.

This does not, however, explain how folders like /Users become affected.

@bdkjones

So I grabbed the open source version of realpath() that Apple has released here: http://www.opensource.apple.com/source/Libc/Libc-498.1.7/stdlib/FreeBSD/realpath.c

I put this in my app and called it. The returned path matched the casing that:

  1. The kernel is writing to /dev/fsevents
  2. All of Apple's standard file-access APIs return.
  3. The alias manager returned.

So, clearly, the open source version of realpath() is not what Apple is actually using in OS X. Because when I call the system's version of realpath() I get a result with one path component capitalized, even when I pass in a path that has that component lower-cased. And there is nothing in the open source implementation that would change the case of a letter.

@bdkjones

I'm going to start writing a replacement for FSEvents. It shall be called FuckFSEvents and it will read directly from /dev/fsevents in a case-insensitive manner, like a goddamn sane person would.

@andreyvit

@bdkjones Wait a sec, that's a very old version of realpath.c; my understanding is that it simply doesn't do anything to the path that you pass, short of resolving symlinks and relative references.

I've previously sent you a different version of realpath from OS X 10.10, and it works in a completely different way, using the getattrlist system call to obtain the actual name of the folder (which performs a vnop_getattrlist call into the non-open-source HFS+ driver).

Now here's a question (which was why I spent time disassembling the FSEvents binary): if you force FSEvents to use a different implementation of realpath, will it cure the problem? I believe that the answer is yes, and I think that's an avenue worth exploring before you go writing a root tool.

@bdkjones

Yes—if we could get FSEvents to watch the path as returned by the alias API or any of Apple's regular file path APIs, it would solve the problem. realpath() is returning capitalized Database as the watched folder name. Everything else is spitting out lowercase database.

So either realpath() is wrong, OR there is a bug down in the kernel or HFS+ driver. It is possible that realpath() is the correct path and that everything else is wrong.

But the Apple engineer has explained to me that, as far as the kernel is concerned, case-sensitivity does not matter. /Folder/Database and /Folder/database are the exact same path in kernel space. So I have no idea which one of these things is messed up.

@andreyvit

Here is a bit of code that I just wrote that intercepts all calls to realpath from the current process, including the ones made by FSEvents. Feel free to plug in your other implementation and see how it fares on your system. It uses mach_override from mach_star.

@bdkjones

How do you suggest we force fsevents to use a different implementation of realpath()?

And writing my own daemon has a bunch of great advantages:

  1. I don't need any of the "historical" crap that fseventsd provides. (the persisted record of events that Time Machine uses.) I just need events that are happening right now.

  2. The kernel provides the PID of the process responsible for an event, but the FSEvents API does not expose that information. We could use this to ignore unwanted changes (like a branch-switch in Git).

  3. The events as they are written to /dev/fsevents are REALLY nicely organized. Example: when a file is renamed or moved, the old path and the new path of the file appear in the SAME event. If you use the FSEvents API, you only get the old path. Discovering the new path is a bunch of work.

  4. Using GCD, it's easy to build a daemon that reads from the kernel's I/O stream without blocking, then send event information to an app using the XPC framework.

@bdkjones

Cool. I can give that a go in the morning (it's 2AM here), but I imagine it won't work because the call to realpath() isn't happening in my process; it's happening in the fseventsd process.

We'd need something like method swizzling, where the call to the function is overridden in ALL processes, system-wide.

@andreyvit

> because the call to realpath() isn't happening in my process; it's happening in the fseventsd process

I cannot say whether the call is also happening in the daemon, but I saw multiple calls to that function from within the FSEvents framework binary (i.e. in-process).

@andreyvit

Updated FSEventsFix.c to include the BSD implementation of realpath and to call it properly.

I confirm that:

  1. LiveReload still works.
  2. The hooked realpath gets called from FSEventStreamCreate.

Now you only need to drop 4 files into your project, call FixFSEvents() from main() and tell us if it worked or not.

@andreyvit

One more confirmation: if I change my realpath to always return an uppercase path, FSEventStreamCopyPathsBeingWatched reports the uppercase path as the one being monitored, and no change events are reported.

At this point, I expect you'll witness a miraculous cure tomorrow. :D

The up-to-date code is now in my features/fix-fsevents branch under mac/FSEventsFix; in particular, here's FSEventsFix.c. This will stay current with any further updates I make, until I merge and delete the branch.

@andreyvit

I've updated the code with the latest version of mach_override (unfortunately, it's no longer just 2 files, but the previous one had a bug). I've also attempted to use dyld interposing and Facebook's fishhook for overriding, but without success so far.

@bdkjones

Son of a bitch. The broken folder on my Mac has suddenly stopped being broken. I'm looking for a new one that's still broken to test the mach_override approach.

@bdkjones

Confirmed. The mach_override approach solves the problem.

Here is what happens without the override applied:

[Screenshot: screen shot 2015-05-13 at 16 09 25]

And once the override is applied, here is the result:

[Screenshot: screen shot 2015-05-13 at 16 17 08]

@bdkjones

The trouble is that we can't ship the mach_override hack. The buggy version of realpath() does a bunch of stuff that's desirable, such as crossing mount points, etc. The old BSD version does not do these things.

My guess is that the root of this bug lies in the way Apple's newer realpath() function calls down into getattrlist to retrieve the name of each path component. That seems to be the smoking gun.

@andreyvit

Made two more changes:

  1. Replaced my realpath implementation with the one from OS X 10.10 — it does everything except replacing names.
  2. Got fishhook hooking to work — it's significantly safer than mach_override (it's even compatible with iOS App Store).

I'm very happy with how this works now, and definitely shipping this.

@ttilley, @thibaudgg Are you guys comfortable shipping a fix like this (overriding realpath implementation process-wide)? What do you need to include it? Should I extract the code into a separate repository?

@andreyvit

Here we go: andreyvit/FSEventsFix. Now it's just two files, and you really only need one of them (FSEventsFix.c). Include the file in the app or library, and the dreaded FSEvents problem will be fixed.

No public API defined or symbols exported, so no concerns about possible name clashes. Multiple copies of FSEventsFix can co-exist in a single process, so it can be included into a library without reservations.

@bdkjones

Nice work! What did you have to do to get Fishhook to work correctly?

Also: what do you think about the ability to toggle the replacement in and out? 95% of the time, there's no need to replace realpath() with our "fixed" version. So, to be safe, what if we only forced the replacement when we had to? We have the ability to detect whether a path is broken (just run Apple's realpath() and do a case-sensitive compare against the path obtained from an FSRef alias). So what if, every time we add a path to an FSEvents stream, we first check to see if it's broken and, only if it is, THEN we swap the functions?

This has the major advantage that once Apple fixes the problem, our code simply stops swapping the functions. No update required.
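
The check described above boils down to a string comparison. Here is a minimal Python sketch of it, for illustration only: the function name is hypothetical, the two inputs are assumed to come from the system realpath() and from whatever second API is used as a reference, and `casefold()` is only an approximation of how an HFS+ volume compares names (it ignores Unicode normalization, for example).

```python
def differs_only_by_case(realpath_result: str, reference_path: str) -> bool:
    """True when the two paths spell the same components but disagree on letter case."""
    return (
        realpath_result != reference_path
        and realpath_result.casefold() == reference_path.casefold()
    )


# Hypothetical values, modelled on the /Folder/Database example in this thread.
print(differs_only_by_case("/Folder/Database", "/Folder/database"))  # True
print(differs_only_by_case("/Folder/Database", "/Folder/Database"))  # False
print(differs_only_by_case("/Folder/Database", "/Folder/Other"))     # False
```

Because the test is cheap, running it once per watched root before creating a stream should add no noticeable overhead.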

@andreyvit

> What did you have to do to get Fishhook to work correctly?

Remove a leading underscore in the symbol name. For some reason they're comparing imported symbols starting with the second character. :-)

> So, to be safe, what if we only forced the replacement when we had to?

Probably a good idea, although I would question whether the added complexity is worth it.

Pros: like you've said, it's marginally safer, and it will auto-disappear when the bug gets fixed.

Cons: we have to continue running realpath against every single folder (or maybe even file, who knows) in the project; I was looking forward to removing that particular code. We also have to add the logic to retry monitoring after we detect a broken folder.

@andreyvit

> No, we only have to run the check on the folder we tell fsevents to watch. Child items of that folder don't matter.

Are you sure about that? I could swear I had a case where a subfolder was broken, but I also assumed it was common knowledge. Now that you question that, I'm no longer sure if it happened.

> Perhaps as an optional way to use the fseventsfix library?

Sure, I'll add it.

@bdkjones

I'm sure. I just tested it over here. I added the parent folder of CloudDocs to my app after verifying that CloudDocs is still broken. FSEvents without the fix applied returned events just fine for files inside the broken folder.

What the FSEvents daemon probably does is check each event that gets written to /dev/fsevents by seeing if the event's path starts with any of the paths we tell the stream to watch (in a case-sensitive way).

So there's no need to check child folders/files.
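
To make that guess concrete, here is a small Python sketch of a case-sensitive prefix filter. It only models the behaviour hypothesised above; it is not taken from fseventsd, and the function name is made up for the example.

```python
def stream_receives(event_path: str, watched_path: str) -> bool:
    """Case-sensitive prefix test on whole path components."""
    watched = watched_path.rstrip("/")
    return event_path == watched or event_path.startswith(watched + "/")


# The stream was registered under the casing realpath() returned...
watched = "/Folder/Database"
# ...but the kernel reports events with a different casing.
print(stream_receives("/Folder/database/file.txt", watched))    # False
print(stream_receives("/Folder/Database/file.txt", watched))    # True
# Watching the parent sidesteps the mismatch, because the broken
# component is no longer part of the watched prefix.
print(stream_receives("/Folder/database/file.txt", "/Folder"))  # True
```

Under this model only the watched root has to match, which is consistent with events still arriving for files inside a broken subfolder.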

@bdkjones

Also: I dunno how I'd have figured any of this crap out without a broken folder on my Mac. Debugging this without having an example case to test with is impossible. I can see why Apple refused to acknowledge it as a bug for years.

@bdkjones

The real kicker is that this isn't even a bug in FSEvents. (Although I STILL am arguing with Apple Engineering that FSEvents ought to be case-insensitive, like, oh, EVERYTHING ELSE on OS X.)

So I suppose we all need to file a new Radar against realpath(). If I write one up and post it on OpenRadar, could everyone here file a duplicate copy? I hear that doing that makes Apple prioritize the bug report a bit more.

@andreyvit

> FSEvents ought to be case-insensitive

This isn't necessarily true; OS X does support case-sensitive installs, and iOS always uses a case-sensitive file system. I'd say that without the realpath bug FSEvents works pretty well as it is.

> I dunno how I'd have figured any of this crap out without a broken folder on my Mac.

Yep, that was a real stroke of luck for all of us.

> could everyone here file a duplicate copy?

Sure. I can even file a duplicate DTS ticket. :-) My dev program renewal time is approaching fast, and I still have 2 unused tickets. Let's prepare a good comprehensive guide on this bug and file away. @bdkjones, how about summarizing all our findings on a GitHub wiki or gist so that we can edit & extend it over time?

@bdkjones

> OS X does support case-sensitive installs

Yes. But not by default. 99.9% of the Macs out there are running case-insensitive installs. And FSEvents should, in my opinion, match the behavior of whatever volume I open a stream on. But the Apple guys tell me there are many edge cases (such as composed character sequences in non-English languages) that make implementing that a nightmare. To which my reply is, "You have $200 billion in the bank. I think you can tackle it."

> how about summarizing all our findings on a GitHub wiki

Will do that this evening and post a link here. I will also upload all the tools required to do the same debugging that I did, so Apple can replicate the process on their machines. I may even record a freaking screencast to walk through it, as long as I still have a broken folder on my Mac.

@andreyvit

> as long as I still have a broken folder on my Mac

Talk about a scarce resource. :-)

Of course, it would be super-awesome to be able to reproduce the bug on demand. Everything points to just having a ‘folder with a lot of changes’ (network sync), but I once tried that and couldn't get the bug to happen. Maybe it's something else: maybe it's ‘changes while power napping’, or something like that.

@bdkjones

> it would be super-awesome to be able to reproduce the bug on demand.

Agreed. Dropbox and syncing clients do seem to aggravate the problem. But I don't have Dropbox installed on my Mac and the two folders that are broken for me are buried in the new Photos app data file and iCloud in ~/Library.

Worse, the broken folder in the Photos data file just magically repaired itself yesterday, after being broken for days. And I didn't take any explicit action to repair it. In fact, I purposefully avoided launching Photos or repairing my disk/permissions, etc. to avoid losing the broken folder.

@thibaudgg
Owner

@andreyvit @bdkjones if you think you can fix that directly in rb-fsevent please don't hesitate to submit a pull request I would merge it with joy :dancers:

@bdkjones

VICTORY!

Apple has acknowledged this as a bug!

They want a radar on it immediately and they want to correct it quickly because it is, technically, file-system damage.

@bdkjones

As requested, here is the Wiki page describing everything we know, with links to all the required tools to diagnose and debug the problem:

https://github.com/bdkjones/fseventsbug/wiki/realpath()-And-FSEvents

@bdkjones

@andreyvit Another excellent reason we need to implement the function-swapping only when required is that it's very, very likely we'll never know exactly when Apple ships a fix to this problem. They aren't going to put this type of low-level change in a set of release notes and no one is going to email us.

If we don't swap the functions only when needed, the only way we'll have to determine when the fix is no longer needed is by shipping a version of our apps without the swapping and waiting to see if anyone reports broken folders anymore.

@bdkjones

Plus, Apple may enhance realpath() down the road, or change the way it operates entirely if the system ever moves off of HFS+, etc. At that point, once they've fixed this issue, we don't want to be swapping the functions any longer—it would probably break things.

@andreyvit

@bdkjones Have you filed a new RADAR yet? I think it makes sense to dupe it verbatim.

@andreyvit

After some thinking, I've changed how FSEventsFix works. Now it requires you to call FSEventsFixInstall() to install, and also provides FSEventsFixIsBroken(path) to check if a folder is broken.

Btw, I've realized that all the FSAlias stuff was basically returning exactly the same value we pass in, so I've removed it from the check.

I've also added versioning via FSEventsFixVersion. The current version is set to 0.9.0, although I intend to set it to 1.0.0 right after @bdkjones confirms he's happy with the library.

Exposing public symbols is not actually a problem because FSEventsFix will be used either as part of an application (under a single developer's control), an external helper tool (e.g. fsevent_watch in rb-fsevent) or a dynamically-loaded library (e.g. a Node.js native module).

@bdkjones

The Apple engineer has responded and has given me a very serious warning about replacing the realpath() function.

He has requested that I not post screenshots of our conversation, but I will summarize his warnings here:

  1. It is not possible to replace realpath() for just FSEvents. If we swap it out, we're doing so for ALL system frameworks. While FSEvents may work correctly, there's no telling what other frameworks will do.

  2. Non-ASCII file names. The original BSD function assumed only ASCII. The replacement realpath() function was designed to handle non-ASCII filenames (which may be one reason they added the low-level calls down to the filesystem). There is no telling how our replacement realpath() might deal with non-ASCII characters.

  3. HFS+ is one of the only filesystems to permit hard-links to directories. This was not accounted for in the original BSD realpath().

Kevin's Suggested Workaround

Kevin suggests that, once we identify a broken folder, we call rename() (the low-level API) to force both HFS+ storage structures to update their values and bring the letter casing back in sync. I am testing this approach now to see if it solves the problem. I will post results when I have them.

@bdkjones

Okay, I've confirmed that using rename() will repair a broken folder as long as you first rename it to something temporary and then back to the correct name.

I've updated my Wiki entry with full details and exact steps, including additional information from Kevin: https://github.com/bdkjones/fseventsbug/wiki/realpath()-And-FSEvents
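
For readers who want to see the mechanics, here is a hedged Python sketch of the two-step rename. The helper name and the temporary-name scheme are assumptions made for this example, and running it on an ordinary scratch folder only shows the sequence of calls; whether it repairs a given broken folder is exactly what is being tested in this thread.

```python
import os
import tempfile
import uuid


def rename_through_temporary_name(folder: str) -> None:
    """Rename folder to a unique sibling name, then back to its original name."""
    folder = os.path.normpath(folder)
    temporary = f"{folder}.{uuid.uuid4().hex}.tmp"
    os.rename(folder, temporary)
    try:
        os.rename(temporary, folder)
    except OSError as error:
        # The folder now lives under the temporary name; say so loudly.
        raise RuntimeError(f"could not rename {temporary} back to {folder}") from error


# Demonstration on a scratch directory (never on a real project folder).
scratch = tempfile.mkdtemp()
target = os.path.join(scratch, "database")
os.mkdir(target)
open(os.path.join(target, "notes.txt"), "w").close()

rename_through_temporary_name(target)
print(os.listdir(scratch))  # ['database']
print(os.listdir(target))   # ['notes.txt']
```

Between the two calls the folder is briefly missing under its usual name, so anything else watching or using it can observe the gap.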

@andreyvit

@bdkjones First, we're not replacing with the original BSD realpath, we're replacing it with exactly the same version the system uses, just without the kernel call.

Second, if we wanted to, it is of course possible to replace the implementation only for FSEvents, because the only call to realpath that matters to us happens synchronously from FSEventStreamCreate.

Third, let's not succumb to paranoia. Our implementation of realpath works fine, and I don't foresee any reason why the system frameworks would have a problem with it. I'm almost sure that for the purposes of our app, even the original BSD implementation of realpath would work just fine.

And finally, they're wrong about rename: it will not always repair a broken folder. For a number of users it does, but for a lot more users it does not help; that's one of the certain things about this bug.

@andreyvit

@thibaudgg I'll send a pull request once we come to some sort of conclusion here, but meanwhile, don't you think that it's a bit too active to be a closed bug? :-)

@ttilley
Collaborator

Fair enough. ;)

@ttilley reopened this
@ttilley
Collaborator

It's unfortunate that any possible fix to this, at the moment, will be a hack with its own downsides and possible negative effects. How would you only duck punch the call to realpath() via FSEventStreamCreate()? You'd permanently hook FSEventStreamCreate() with another hook that hooks realpath(), performs the original call, and then removes the hook before returning the result?

Hey bro, I heard you like method swizzling, so i swizzled your method so you can swizzle your method from within your swizzled method. (sorry, couldn't help myself)

@andreyvit

@ttilley Actually I do like method swizzling, and generally I'm not at all afraid of shipping stuff like that. Adding a bit of maintenance cost, yes, but in this case it's clearly acceptable.

There are four options now:

  1. Hook permanently, but use a global flag to choose between our and original implementations. Set the flag when calling FSEventStreamCreate, reset afterwards.
  2. Hook permanently, but examine the call stack at run time to choose the implementation.
  3. Hook for the duration of FSEventStreamCreate, and then unhook.
  4. Hook permanently and consider it a gift to all the other code that could also be affected by the bug.

Generally, I'm not afraid of shipping stuff like that. Years of production experience. Have I ever told you the story of software RAID for RAM? Sometimes the way to get the job done is to inject code into another process, scan the memory for certain machine code patterns and replace them with modified code, and then combat memory corruptions (caused by an erratic host process) by keeping multiple copies of your state, comparing the copies on each access. That is a bit of a dirty trick, but it's in production and happily used by a huge enterprise for a critical line-of-business application.

I don't see the realpath fix as unfortunate or undesirable, globally, so I'm leaning towards option 4 for my app. I'm open to having option 1 or 3 available in the API, though.

I do have a concern, though, that sometimes the path passed in may not be in the correct case to begin with, especially if the path comes from the command line (as in rb-fsevent's case, or in the case of LiveReload's URL scheme API). I'm thinking of changing how our realpath works so that it attempts to restore the correct case by enumerating the directories and finding case-insensitive matches for path components. As a bonus, this would also more closely match the behavior of realpath.
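
A rough Python sketch of that idea follows. It is not the FSEventsFix code: the function name is hypothetical, it treats the directory listing as the authoritative spelling (an assumption), and if two entries differ only by case it simply takes the first one the listing returns.

```python
import os
import tempfile


def restore_case(root: str, relative_path: str) -> str:
    """Rebuild relative_path under root using the names the directory listing reports."""
    current = root
    parts = [part for part in relative_path.split("/") if part]
    for index, part in enumerate(parts):
        try:
            entries = os.listdir(current)
        except OSError:
            entries = []
        if part in entries:
            match = part  # exact spelling exists, keep it
        else:
            folded = part.casefold()
            match = next((e for e in entries if e.casefold() == folded), None)
        if match is None:
            # Nothing on disk matches: keep the rest of the input unchanged.
            return os.path.join(current, *parts[index:])
        current = os.path.join(current, match)
    return current


# Scratch tree: <root>/Database/Sub
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "Database", "Sub"))

fixed = restore_case(root, "database/SUB")
print(os.path.relpath(fixed, root))  # Database/Sub
```

Components that do not exist yet are passed through untouched, so a path whose tail is still to be created keeps the spelling the caller supplied.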

@bdkjones

Also: I think that the reason rename() failed for some folks is that they renamed a subfolder of a broken folder rather than the broken folder itself. I can't confirm that, of course, but the rename() operation going through a temporary folder name physically unlinks the filesystem object representing the old folder and creates a new object representing the new folder. This results in an update to both underlying HFS+ catalog stores.

rename() should always work, provided that it targets the actual broken folder, which may not be the last folder of a given path.

@bdkjones

Agreed.

But what I'd like to ship is this process:

  1. Find a broken folder by comparing NSURL to realpath().
  2. Attempt a rename() if possible (unless it requires root access).
  3. Repeat the check between NSURL and realpath() to verify the folder is fixed.
  4. If the folder is still broken, use your library to swap the realpath() function before creating the FSEvents stream.
  5. Once the stream is created, unhook the function and swap back in the original realpath().

Oh, and if we can't pull off rename() because doing so takes root access, fall back on the function swapping.

This way, we're taking the safest approach when it's at all possible and falling back on the function-swapping only as a last resort. And, when Apple fixes the root cause, this code will still be 100% fine without any updates from us—it won't be needlessly renaming folders or swapping functions. The extra overhead involved in checking a handful of folders won't even be noticeable by users.
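
The decision part of that process can be sketched as follows, in Python for brevity. The callables are hypothetical placeholders: `is_broken` stands for the NSURL-versus-realpath() comparison, and `try_rename_repair` for the rename step, returning False when the rename is not permitted. The hook and unhook around stream creation are left to the caller.

```python
from typing import Callable


def choose_strategy(
    path: str,
    is_broken: Callable[[str], bool],
    try_rename_repair: Callable[[str], bool],
) -> str:
    """Return 'plain' when the stream can be created as-is, 'hook' when realpath must be swapped."""
    if not is_broken(path):
        return "plain"  # step 1: nothing to fix
    if try_rename_repair(path) and not is_broken(path):
        return "plain"  # steps 2 and 3: repaired and verified
    return "hook"       # step 4: fall back on function swapping


# Fake collaborators so the flow can be exercised without a broken folder.
state = {"broken": True}


def fake_is_broken(path: str) -> bool:
    return state["broken"]


def fake_repair(path: str) -> bool:
    state["broken"] = False
    return True


def denied_repair(path: str) -> bool:
    return False  # e.g. the rename would need root access


print(choose_strategy("/Folder/Database", fake_is_broken, fake_repair))    # plain
state["broken"] = True
print(choose_strategy("/Folder/Database", fake_is_broken, denied_repair))  # hook
```

Once the underlying bug is fixed, `is_broken` never fires and the function always returns 'plain', which is the "no update required" property argued for above.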

@andreyvit

All right, the new version of FSEventsFix allows the flow that Bryan has suggested, see the updated README. For the fixing part, it tries to do a rename into itself, which doesn't return an error, but I'm not actually sure if it does anything. I don't think I'll be making a lot of further changes; @bdkjones, feel free to extend _FSEventsFixAttemptRepair to your liking.

I don't do NSURL stuff when checking for broken folders, but if you really want to, you can pass the path through NSURL before calling FSEventsFix APIs.

(I've traced the internals of [NSURL initWithPath], which calls into CFURLCreateWithFileSystemPath, which is a huge function with tons of code paths, but for a normal absolute path it seems to simply create a file system representation of the path, presumably in a way similar to [NSString fileSystemRepresentation], and then create a URL from the file system representation by adding percent escapes. So using NSURL doesn't seem like a hugely valuable addition for me, besides, LiveReload's internals use NSURL almost everywhere, so the path will already be coming from that source.)

@ttilley
Collaborator

Since realpath() was wonky, I made rb-fsevent run CFURLCreateWithFileSystemPath(), chop off imaginary paths until I hit an existing path (being called with a path that doesn't exist yet is valid), convert to volId/inodeId pair, and ask what the system thinks the path for that should be (using CFURLCreateFileReferenceURL() followed by CFURLCreateFilePathURL()). This also should drop down to the filesystem to ask what the path should be, but I have no idea what calls are made in the background and I have nothing broken to test against.
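
The "chop off imaginary paths" step is easy to show in isolation. This Python sketch is not the rb-fsevent code (which is C and uses CFURL); the function name is made up, and the file-reference round trip on the existing part is deliberately left out.

```python
import os
import tempfile
from typing import Tuple


def split_existing_prefix(path: str) -> Tuple[str, str]:
    """Split path into (deepest existing ancestor, components that do not exist yet)."""
    existing = os.path.abspath(path)
    missing = []
    # The filesystem root always exists, so the loop terminates.
    while not os.path.lexists(existing):
        existing, tail = os.path.split(existing)
        missing.append(tail)
    return existing, "/".join(reversed(missing))


scratch = tempfile.mkdtemp()
existing, missing = split_existing_prefix(os.path.join(scratch, "not", "yet", "created"))
print(existing == os.path.abspath(scratch))  # True
print(missing)                               # not/yet/created
```

Only the existing part would then be handed to the system for canonicalisation, with the missing tail appended back unchanged.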

I guess it might be worth adding some ObjC to the project just to make sure i'm going through a codepath documented to be "correct".

@thibaudgg - i'm thinking of dropping support for pre-10.9 in the main binary if i'm going to be adding a workaround. there was a local root exploit discovered in the system preferences frameworks that was only fixed in 10.10 and apple says would be too much work to fix elsewhere... affecting 10.7-10.9... so nobody in their right mind should be running anything other than 10.10 at this point (or 10.6 if you have an i386 mac, or 10.5 on ppc, but even then you're better off installing linux to get a secure and up-to-date system). recent xcode can't even target pre-10.9.

@andreyvit

@ttilley In turn, it may be worth adding some CoreFoundation to FSEventsFix. Right now it's all at POSIX level. And I don't think [NSURL initWithPath:] can be considered more correct; it's a thin wrapper around CFURLCreateWithFileSystemPath.

@bdkjones

I can definitely confirm that calling rename() with identical from and to paths does NOT fix a broken folder. You have to go through a temporary-name step for that fix to work.

@andreyvit

@bdkjones Good, thanks. But are you sure we want to silently rename things? Also, how many broken folders remain at your disposal? :)

@bdkjones

Yea, I discussed that with the Apple guy. He says that the entire filesystem is designed to handle things moving around, even if the thing we're moving is /Users. Everything that has a file open in an affected folder has an open file descriptor, which lets them resolve the path to the filesystem object. He says there should be no problems renaming a folder to something temporary and then renaming it back.

@bdkjones

I have zero broken folders left, sadly.

@andreyvit

@bdkjones The scenario I'm worried about is if someone else is watching the folder for changes, and when it goes away, it's registered as a change, and they immediately do something that recreates the missing folder.

Another scenario: you have a Git checkout running, and while waiting, you pre-emptively add the folder to LiveReload/CodeKit. Bam, your Git is complaining about a missing folder. (Or will it? Worth testing, but you can see why I'm worried.)

@bdkjones

Yea, but in theory anything that's watching that folder for changes using FSEvents is also broken and won't receive any events. Of course, if they're watching a parent folder of the broken folder, they will. As will anything that's subscribed to changes using kQueues instead of FSEvents.

We could solve this by showing a dialog box to the user before our apps rename folders. Something like, "Hey, folder X is currently damaged and can't be watched. We can fix that, but you'll want to close any apps that are watching for changes in this folder first."

@andreyvit

Yes, I would definitely want to put up a dialog box before renaming a folder that the user doesn't expect me to rename.

But in that case, I have to question whether I even want to. Given a choice between a silent approach that works (hooking) and displaying a scary confusing dialog box, I'll pick the former any time. If something breaks with a future OS X update, it's not like we cannot update our apps.

@andreyvit

@bdkjones I think I'll remove the one-step check-and-repair method, and will stop recommending it. At the very least, it doesn't seem like something that rb-fsevent or another library should do silently, and they don't have the option of showing UI. You'll probably be the only one shipping the repair code, so I would appreciate it if you took over maintenance of the repair method. Until you can contribute a working version, I'll remove the repair method from the headers.

FSEventsFixIsBroken and FSEventsFixCopyRootBrokenFolderPath stay, and are enough to identify and report the bug.

@andreyvit

Good news: I just ran find-fsevents-bugs myself (stupid to do it this late, right?), and I have quite a few under ~/Dropbox. Holy crap, I can test this stuff! (Probably. Still need to verify that it's true.)

@andreyvit

...just fixed find-fsevents-bugs so that it doesn't descend into broken folders. This results in a 100x saner result list, containing only 4 folders in my case.
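
The pruning idea, sketched in Python rather than taken from find-fsevents-bugs itself: the walker reports a broken folder and then skips everything beneath it. The `is_broken` predicate is a placeholder for the real check; the demo below uses a fake one based on the folder name.

```python
import os
import tempfile
from typing import Callable, Iterator


def find_broken_roots(top: str, is_broken: Callable[[str], bool]) -> Iterator[str]:
    """Yield broken folders under top, without descending into them."""
    for dirpath, dirnames, _filenames in os.walk(top):
        kept = []
        for name in sorted(dirnames):
            full = os.path.join(dirpath, name)
            if is_broken(full):
                yield full
            else:
                kept.append(name)
        # Editing the list in place tells os.walk which subfolders to visit.
        dirnames[:] = kept


# Scratch tree with a "broken" folder nested inside another one.
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, "a", "Database", "inner", "Database"))
os.makedirs(os.path.join(top, "b"))


def fake_is_broken(path: str) -> bool:
    return os.path.basename(path) == "Database"


found = [os.path.relpath(p, top) for p in find_broken_roots(top, fake_is_broken)]
print(found)  # ['a/Database']
```

The nested folder is never visited, which is why the result list shrinks so much once descent stops at the first broken ancestor.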

Unfortunately, they all are reported as Found (FSCopyAliasInfo), which means that realpath data isn't corrupted, but rather FSCopyAliasInfo data is. My understanding is that it means that these folders aren't broken from the point of view of FSEvents. :-(

Btw, ~/Pictures/Photos Library.photoslibrary/Database is on the list as well. Could be a semi-reproducible way to get the corruption.

Can everyone in this chat run find-fsevents-bugs as well? Be sure to update to the latest commit. You can download a pre-built x64 version here: http://cl.ly/3K3f0z1a3W0s

@thibaudgg
Owner

@ttilley I'm totally fine about dropping pre-10.9, go for it!

@andreyvit

For the sake of completeness, here's find-fsevents-bugs report for my root directory: https://gist.github.com/andreyvit/905f8c8c815cfec63a09. I find it fascinating; in some cases (under /Applications), FSCopyAliasInfo is clearly the only correct data left, while both realpath data and the primary directory listing are corrupted.

@ttilley
Collaborator

@andreyvit - my photos db is also broken. realpath() returns Pictures/Photos Library.photoslibrary/Database while my little CFURL dance used in rb-fsevent returns Pictures/Photos Library.photoslibrary/database.

Interestingly enough, both give me correct capitalization for parent folders (i passed in an all-caps path), which I didn't expect realpath to do. That is, until I applied the fseventfix workaround. Then all folders had their case unchanged, including the final folder. The result was PICTURES/PHOTOS LIBRARY.photoslibrary/DATABASE. That's not correct for either result.

@andreyvit

ACTUALLY, all those folders are broken with regard to FSEvents! It seems that FSCopyAliasInfo gives the path reported by the kernel.

Additionally, it appears that doing [NSURL fileURLWithPath:path].fileReferenceURL.filePathURL is equivalent to the FSCopyAliasInfo call. And calling [NSURL initFileURLWithPath:].path doesn't change the case of the input string, so it simply does nothing.

So it actually violates our previous assumptions:

  • We assumed that all Apple APIs other than realpath give the same result. In reality, in my case, enumerating the directory via opendir/readdir, looking at it in Finder, doing ls and calling realpath give the same results, but calling FSCopyAliasInfo, creating a file reference URL and then resolving it, and dragging-and-dropping from Finder into LiveReload give a different result.

  • We (or at least I) assumed that realpath gives an incorrect result, but in reality either side may be broken. ~/Dropbox/Clients/KEY-TEC looks correct in Finder/realpath, but is reported in lowercase by FSCopyAliasInfo, while /Applications/Adobe Photoshop CC 2014/Adobe Photoshop CC 2014.app/Contents/Frameworks/UpdaterNotifications.framework looks lowercase in Finder, but is reported correctly by FSCopyAliasInfo.

So FSEventsFix's current implementation of the broken folder check is itself broken. I don't particularly want to use FSCopyAliasInfo, so I guess I'll use the CFURL file reference stuff.
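
For illustration, the core of such a check can be sketched in plain C as a comparison of the two spellings. This is only a sketch under assumptions: `paths_differ_only_in_case` is a hypothetical helper, not part of FSEventsFix, and it takes both strings as inputs rather than calling realpath or the file reference APIs itself. It also folds ASCII case only, which may not match how HFS+ compares names.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <strings.h>

// Hypothetical helper (not from FSEventsFix).
// `listed` stands for the realpath()/readdir spelling and `referenced`
// for the spelling obtained by resolving a file reference URL; obtaining
// either string is outside this sketch.
// Returns true when the two paths match case-insensitively but are
// spelled with different case, which is the symptom described above.
static bool paths_differ_only_in_case(const char *listed, const char *referenced)
{
  assert(listed != NULL && referenced != NULL);

  if (strcmp(listed, referenced) == 0) {
    return false; // identical spelling: nothing suspicious
  }

  // strcasecmp folds ASCII letters only in the C locale
  return strcasecmp(listed, referenced) == 0;
}
```

A caller would treat a `true` result as "this folder needs the workaround" and a `false` result as either healthy or unrelated paths.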

@ttilley
Collaborator

@bdkjones This is new information for your dev ticket, btw. In my case above, realpath() gave me what I see when I ls or open in Finder, whereas the CFURL file reference resolving gave me an incorrect result.

@andreyvit

@ttilley Yes, I was operating under several incorrect assumptions, so need to fix the fix now. :-)

But hey, it looks like the broken folders are aplenty, and the Photos.app is a blessing! It may be the reproduction method we've been looking for, and related to a new Apple app to boot. @bdkjones, I'm sure Kevin is going to like all of this. Be sure to get him to run find-fsevents-bugs on his machine as well.

@andreyvit

To get the icing on the cake, we just need to prove that Time Machine backups of the broken folders are also broken. My time machine disk is in the office, and I'm not; can anyone check?

@andreyvit

Here's my action plan:

  1. Change broken folder detection to use Travis's CFURL dance™.
  2. Change the realpath shim to just call the original and apply CFURL dance™ on top.
  3. PROFIT!

Here's an additional idea for rename fix: what if we only changed case when renaming? That has a very low probability of interfering with other apps because the file system is case-insensitive. And we also don't even need to generate a new name; because realpath != FSCopyAliasInfo, we can rename the path into whatever FSCopyAliasInfo returns and back.
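
A minimal sketch of that rename round trip, assuming the two spellings are already known. `rename_case_roundtrip` is a hypothetical name, and whether a round trip like this actually repairs a broken folder is untested here.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper: rename `shown` (the realpath spelling) to `alias`
// (the spelling FSCopyAliasInfo reports) and then back again.
// Returns 0 on success, -1 if either rename fails; errno is whatever
// rename() left behind.
static int rename_case_roundtrip(const char *shown, const char *alias)
{
  assert(shown != NULL && alias != NULL);

  if (rename(shown, alias) != 0) {
    return -1; // first step failed: nothing was changed
  }
  if (rename(alias, shown) != 0) {
    return -1; // second step failed: the entry is left under `alias`
  }
  return 0;
}
```

If the second rename fails, the folder stays under the alias spelling, so a real implementation would need to surface that to the user rather than fail silently.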

@bdkjones
@ttilley
Collaborator

@bdkjones I had assumed that Finder would always show the correct case. What you said earlier suggested this. But Finder and realpath agree locally, with CFURL file reference resolving giving a different result. So not all Apple APIs end up returning the correct result then.

@bdkjones
@andreyvit

@bdkjones ‘Correct’ is a relative term here.

Here are my definitions:

Grade-A correctness: the case that has been used to create the folder (assuming that the folder has never been renamed). And hey, it's extremely unlikely that a mixed-case name could be erroneously produced from a lowercase name, so the mixed-case name is likely always correct.

Grade-B correctness: the case reported by Finder, ls and other enumeration APIs, because that's what I see (and because in most cases this matches the name I originally used, although Adobe guys may beg to differ).

From what I see on my system, grade-A correctness can be on either side, and grade-B correctness is always on realpath's side.

Now, from FSEvents' point of view, FSRef may always be the ‘correct’ one, and realpath the ‘incorrect’ one, but I think we should rather say that the kernel uses an incorrect FSRef case when writing to /dev/fsevents, and so FSEvents needs an incorrect FSRef case to work.

@andreyvit

That find-bugs tool will report mismatches one way (realpath is wrong) or another (copyAlias is wrong) depending on what case you pass into the tool.

True for the folder you pass in, but for the descendant paths it uses the case returned by readdir, which btw is made much clearer in the updated output format.

Perhaps from the kernel's point of view one of my prior clients is called ‘key-tec’, but I personally called them ‘KEY-TEC’, and Dropbox has a folder called ‘KEY-TEC’, and readdir returns ‘KEY-TEC’, and realpath returns ‘KEY-TEC’, only the NSURL dance™, FSCopyAliasInfo and some unspecified parts of the kernel insist on ‘key-tec’, and I'll be excused for thinking that value highly incorrect. :-)

@andreyvit

@bdkjones But I see your point; for FSEvents, we always need the result of the CFURL dance™, which is already on my action plan. I think we don't have an actual disagreement in this discussion. :-)

@bdkjones

Yea. It's also worth pointing out that I had just two broken folders to test, so by no means is my testing exhaustive—it's quite possible the folders are breaking in other ways.

However, Kevin implied that the path obtained through NSURL (which is just a wrapper on CFURL, which itself is just a wrapper on CFString) should be "The One True Path" as far as the kernel is concerned. That is, CFURL should always match what the kernel spits out to /dev/fsevents.

Whether or not that output is actually the "Correct" path is above our pay grade. Let Apple's HFS+ team worry about that crap. All we care about is getting FSEvents to use the exact path that the kernel is going to spit out.

@ttilley
Collaborator

@andreyvit playing with fishhook locally, it doesn't seem to rebind within the current scope? if I rebind symbols, then call realpath immediately afterward, it calls the original realpath and not the replacement. did I break it? or does replacement only work for external references?

Anyways, here's a fairly naive implementation of realpath() on top of CFURL.

// naive implementation of realpath on top of CFURL
// NOTE: doesn't quite support the full range of errno results one would
// expect here, in part because some of these functions just return a boolean,
// and in part because i'm not dealing with messy CFErrorRef objects and
// attempting to translate those to sane errno values.
// NOTE: the OSX realpath will return _where_ resolution failed in resolved_name
// if passed in and return NULL. we can't properly support that extension here
// since the resolution happens entirely behind the scenes to us in CFURL.
static char* CFURL_realpath(const char *restrict file_name, char *restrict resolved_name)
{
  char* resolved;
  CFURLRef url1;
  CFURLRef url2;
  CFStringRef path;

  if (file_name == NULL) {
    errno = EINVAL;
    return (NULL);
  }

  #if __DARWIN_UNIX03
    if (*file_name == 0) {
        errno = ENOENT;
        return (NULL);
    }
  #endif

  // create a buffer to store our result if we weren't passed one
  if (!resolved_name) {
    if ((resolved = malloc(PATH_MAX)) == NULL) return (NULL);
  } else {
    resolved = resolved_name;
  }

  url1 = CFURLCreateFromFileSystemRepresentation(NULL, (const UInt8*)file_name, (CFIndex)strlen(file_name), false);
  if (url1 == NULL) { goto error_return; }

  url2 = CFURLCopyAbsoluteURL(url1);
  CFRelease(url1);
  if (url2 == NULL) { goto error_return; }

  url1 = CFURLCreateFileReferenceURL(NULL, url2, NULL);
  CFRelease(url2);
  if (url1 == NULL) { goto error_return; }

  // if there are multiple hard links to the original path, this may end up
  // being _completely_ different from what was intended
  url2 = CFURLCreateFilePathURL(NULL, url1, NULL);
  CFRelease(url1);
  if (url2 == NULL) { goto error_return; }

  path = CFURLCopyFileSystemPath(url2, kCFURLPOSIXPathStyle);
  CFRelease(url2);
  if (path == NULL) { goto error_return; }

  if (CFStringGetCString(path, resolved, PATH_MAX, kCFStringEncodingUTF8)) {
    CFRelease(path);
  } else {
    // unable to convert? path too long? not a clue.
    // release the string on this branch too, otherwise it leaks
    CFRelease(path);
    goto error_return;
  }

  return resolved;

error_return:
  if (!resolved_name) {
    // we weren't passed in an output buffer and created our own. free it
    int e = errno;
    free(resolved);
    errno = e;
  }
  return (NULL);
}
@andreyvit

@ttilley I see we've dragged you into this. :-)

It does rebind in the current scope. Look at FSEventsFix, I've simplified and modified fishhook a bit. Your code is much appreciated. I'd love us to work on the same code base, though; the world at large would benefit from a well-tested fix implementation shared between major players.

As for the errors and edge cases, I propose to call the actual realpath first, and only apply CFURL stuff if it succeeded. If realpath succeeds but CFURL fails, just return what realpath returned.
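
Roughly, that fallback order could look like the sketch below. Everything here is an assumption for illustration: `resolve_with_fallback` and the `resolve_fn` callback type are hypothetical, the callbacks stand in for the original realpath and for a CFURL-based pass, and the three small resolvers exist only so the sketch can be exercised without CoreFoundation.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

// Both callbacks follow the realpath() contract for a caller-supplied
// buffer: fill `out` (PATH_MAX bytes, NUL-terminated) and return it,
// or return NULL on failure.
typedef char *(*resolve_fn)(const char *path, char *out);

// Hypothetical wrapper: `original` stands for the real realpath() and
// `refine` for a CFURL-based pass applied on top of its result.
// This sketch requires a non-NULL `out` buffer.
static char *resolve_with_fallback(const char *path, char *out,
                                   resolve_fn original, resolve_fn refine)
{
  char refined[PATH_MAX];

  assert(path != NULL && out != NULL && original != NULL && refine != NULL);

  if (original(path, out) == NULL) {
    return NULL; // genuine failure: keep the original's error behavior
  }
  if (refine(out, refined) != NULL) {
    strcpy(out, refined); // refine succeeded: prefer its spelling
  }
  return out; // refine failed: keep what the original returned
}

// Stand-in resolvers, for demonstration only.
static char *copy_resolver(const char *path, char *out)
{
  if (strlen(path) >= PATH_MAX) return NULL;
  strcpy(out, path);
  return out;
}

static char *lowercase_resolver(const char *path, char *out)
{
  size_t i, n = strlen(path);
  if (n >= PATH_MAX) return NULL;
  for (i = 0; i <= n; i++) { // <= n copies the terminating NUL as well
    out[i] = (char)tolower((unsigned char)path[i]);
  }
  return out;
}

static char *failing_resolver(const char *path, char *out)
{
  (void)path;
  (void)out;
  return NULL;
}
```

The point of this ordering is that all the errno and edge-case behavior comes from the original function, and the second pass can only change the spelling of an already valid result.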

@ttilley referenced this issue from a commit
@ttilley implement workaround for issue #10
due to a subtle HFS+ filesystem corruption bug in OSX, the kernel may be
reporting events using a different case than what the fsevents daemon is
expecting. to work around this, we have to override the behavior of realpath()
before calling FSEventStreamCreate() to force fsevents to use the expected
case. note that we're detecting whether or not the hack is required before
overriding realpath(), so most of the time behavior should be the same.
70e91e1
@ttilley
Collaborator

a new release of rb-fsevent has been pushed to rubygems that includes this workaround (only enabled if the issue is detected). leaving this bug open until things settle down a bit.

@bdkjones

I just got an email that Apple Engineering has closed my bug report about realpath() as a duplicate.

I can't put into words how angry I am at Apple Engineering. Either they didn't bother to read the report carefully, or they've known about this issue for a long time and simply don't care about fixing it.

Either way, fuck Apple. I'm going to ship the Fishhook fix. I'm not going to muck around with renaming crap or trying to do Apple's job for them. I'm simply going to tear out realpath() before creating a stream and then swap back in Apple's broken implementation right after creating the stream.

And the next Apple Engineer I meet is getting a swift kick in the ass.

@bdkjones

@andreyvit Thanks for being more talented and dedicated than the clowns in Cupertino.

@thibaudgg
Owner

@andreyvit @bdkjones @ttilley thanks for having resolved that one! :star2:

@bdkjones
@andreyvit

I'm putting the last touches on FSEventsFix. Meanwhile, we've sat down with @tvasenin and found the HFS+ driver source code (which is open-source, after all; a fact that was helpfully discovered by first finding some source code references inside a disassembled Mach kernel, lol). We've spent a fair amount of time digging inside, but didn't come across anything that could possibly convert names into lower case. :-( :-)

@andreyvit

FSEventsFix 1.0.0 has just been released using @ttilley's unmodified CFURL_realpath (he's been added to the copyrights). I've deleted the repair functions and verified that the fix works on actual broken folders. I've also tried to explain the bug and the workaround in more detail in the README.

Thanks to everyone involved; a collaboration like this is a highlight of everything that's good about the open-source Cocoa community. It's a real pleasure to work with you guys.

@thibaudgg
Owner

:ok_hand:

@andreyvit

...and of course that was not the end of the story. I've decided that a radical simplification of the API and internals is in order, so let me present FSEventsFix 2.0. :-)

Important changes:

  1. It's 10x safer ©: realpath is now only replaced for the FSEvents binary, not for every image in the process, and we're no longer using a dyld hook.
  2. Added FSEventsFixIsCorrectPathToWatch function to check the result of FSEventStreamCopyPathsBeingWatched for the ultimate test of whether the fix has worked.
  3. No more logging and debug options; FSEventsFixEnable returns bool and takes an optional char ** argument for an error message if any.
  4. For the orig_realpath called from our CFURL-based reimplementation, we're now using the proper _realpath$DARWIN_EXTSN instead of the strictly-POSIX _realpath.
  5. The functions no longer try to be thread-safe by executing everything on an internal queue (which has been removed), and in fact there's no mutable global state except for the import table of FSEvents. If the caller wants thread safety, they can bring their own dispatch queue or whatever.
@bdkjones

We've got a new twist to handle.

The FSEventsFix library works great. However, the one thing it does NOT handle is the case where a path is being watched and events are being reported perfectly normally... and then the "singularity" occurs that breaks the folder. Boom: no more events reported.

In my app, the user can fix this by clicking a "refresh" button in my UI, which scans the project folder and recreates the FSEvents stream. But that's manual work, annoying AND the end result is still the same as before the fix: our apps stop responding to changes in certain folders.

Thus far, the only way I can think to combat this is polling. Every 30 seconds, ask the FSEventStream what paths it thinks it's watching, then test each one with FSEventsFix and, if any of them are broken, recreate the stream.

Anyone have a better approach?

@thibaudgg
Owner

Using polling to make system events work is kind of a joke, but sadly I can't think of a better approach.
Is refreshing fast enough?

@andreyvit

@bdkjones While this case is certainly possible, I'd say that when the app stops working, the user will notice and will try restarting it (or clicking a refresh button, if one is available) as the first troubleshooting step. I'm not sure we actually need to do anything.

@andreyvit

@bdkjones On the other hand, the set of broken folders under my Dropbox seems to change at least weekly, so I can imagine an unfortunate case where the user has to do manual actions every few days. But I think I'm temporarily out of energy for this bug and would rather not deal with optional complications. :-)

@ttilley
Collaborator
@andreyvit

I agree with @ttilley; if I were interested in solving the problem, I wouldn't do polling. One option would be to create FSEvents watchers for other potential names (for a normal folder, you can preregister a lowercase name, and for a broken folder, you can register its realpath name). So if the path has N components, you would create 2N watchers instead of one (or you can watch 2N paths with a single watcher and add a trick to FSEventsFix to return different values from each intercepted realpath call).
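
One possible reading of "preregister a lowercase name" is to produce, for each component of the watched path, a copy of the path with just that component lowercased. The sketch below only builds those strings; `count_components` and `lowercase_component` are hypothetical helpers, nothing here talks to FSEvents, and the case folding is ASCII-only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Counts the non-empty components of a '/'-separated path.
static size_t count_components(const char *path)
{
  size_t count = 0;
  bool in_component = false;

  assert(path != NULL);
  for (; *path != '\0'; path++) {
    if (*path == '/') {
      in_component = false;
    } else if (!in_component) {
      in_component = true;
      count++;
    }
  }
  return count;
}

// Copies `path` into `out`, lowercasing (ASCII only) the component at
// zero-based `index`. Returns false if `out` is too small or `index`
// is out of range; `out` is left untouched in that case.
static bool lowercase_component(const char *path, size_t index,
                                char *out, size_t out_size)
{
  size_t len, i;
  size_t current = 0;        // index of the component being copied
  bool in_component = false;

  assert(path != NULL && out != NULL);
  len = strlen(path);
  if (len >= out_size || index >= count_components(path)) {
    return false;
  }

  for (i = 0; i < len; i++) {
    char c = path[i];
    if (c == '/') {
      if (in_component) current++; // just finished a component
      in_component = false;
    } else {
      in_component = true;
      if (current == index) c = (char)tolower((unsigned char)c);
    }
    out[i] = c;
  }
  out[len] = '\0';
  return true;
}
```

Looping `index` from 0 to `count_components(path) - 1` yields N alternative spellings to register alongside the original path.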

@mxstbr referenced this issue in facebook/react-native
Closed

index.ios.bundle not updating #1427

@ttilley
Collaborator

Sigh. OSX 10.11 might break our workaround without fixing fsevents either. Fuck all to NDAs, from the pre-release documentation:

System Integrity Protection

A new security policy that applies to every running process, including privileged code and code that runs out of the sandbox. The policy extends additional protections to components on disk and at run-time, only allowing system binaries to be modified by the system installer and software updates. Code injection and runtime attachments are no longer permitted.

I haven't tested it yet. I'm not even sure I'll still have broken folders after updating to test with (or if i'll regret updating for numerous other reasons).

@andreyvit

@ttilley We'll see what it actually means in practice. And maybe Apple will fix the bug in 10.11, who knows. I don't think we should worry about this until after the public beta.

@ttilley
Collaborator

@andreyvit well i took the plunge and previously broken folders have fixed themselves, though probably only temporarily... i noticed installd doing an intensive audit for a good 20 minutes. also, google chrome is now the only google app i have installed that will still run. even the google updater service spews error messages. google drive is a complete no go. instant regret.

@bdkjones

You didn't upgrade your main partition, did you? Ouch. OS X previews before DP4 are almost always a disaster. Hope you used a separate partition!

@bdkjones

Also: the Apple DTS Engineer got back to me with more information. His speculation was not correct, or at least it was not the whole story.

Apple has known about the issue that causes our problem with FSEvents for years, although this is the first time anyone has shown them that the issue breaks FSEvents. The underlying problem is way down in VFS and, apparently, fixing it involves a monumental effort and there are other side effects.

My interpretation of that is that we aren't likely to see this problem go away until Apple abandons HFS+ as their disk format. So my money is on 10.11 still having the same issue as 10.10 and previous.

@andreyvit

@bdkjones Any chance you can share the actual DTS exchange with me privately? (I assume @ttilley would love to read it as well.)

@ttilley
Collaborator

indeed i would.

@andreyvit

Holy fucking crap. I've just reproduced the issue on demand.

Before:

% ~/dev/mini/find-fsevents-bugs/find-fsevents-bugs ~/Dropbox  
Result 1:                                                                      
  readdir:         /Users/andreyvit/Dropbox/Budget
  realpath:        /Users/andreyvit/Dropbox/Budget
 !FSCopyAliasInfo: /Users/andreyvit/Dropbox/budget
Result 2:                                          
  readdir:         /Users/andreyvit/Dropbox/Clients/Ascendo
  realpath:        /Users/andreyvit/Dropbox/Clients/Ascendo
 !FSCopyAliasInfo: /Users/andreyvit/Dropbox/Clients/ascendo
Result 3:                                                                      
  readdir:         /Users/andreyvit/Dropbox/Clients/KEY-TEC
  realpath:        /Users/andreyvit/Dropbox/Clients/KEY-TEC
 !FSCopyAliasInfo: /Users/andreyvit/Dropbox/Clients/key-tec
Done, 3 result(s).                                                             

then (naturally, the folder is actually called Writing):

% ls /Users/andreyvit/Dropbox/writing

After:

% ~/dev/mini/find-fsevents-bugs/find-fsevents-bugs ~/Dropbox 
Result 1:                                                                      
  readdir:         /Users/andreyvit/Dropbox/Budget
  realpath:        /Users/andreyvit/Dropbox/Budget
 !FSCopyAliasInfo: /Users/andreyvit/Dropbox/budget
Result 2:                                          
  readdir:         /Users/andreyvit/Dropbox/Clients/Ascendo
  realpath:        /Users/andreyvit/Dropbox/Clients/Ascendo
 !FSCopyAliasInfo: /Users/andreyvit/Dropbox/Clients/ascendo
Result 3:                                                                      
  readdir:         /Users/andreyvit/Dropbox/Clients/KEY-TEC
  realpath:        /Users/andreyvit/Dropbox/Clients/KEY-TEC
 !FSCopyAliasInfo: /Users/andreyvit/Dropbox/Clients/key-tec
Result 4:                                                                      
  readdir:         /Users/andreyvit/Dropbox/Writing
  realpath:        /Users/andreyvit/Dropbox/Writing
 !FSCopyAliasInfo: /Users/andreyvit/Dropbox/writing
Done, 4 result(s).                                                             
@andreyvit

So for those who aren't Bryan or Travis, the bug, as explained by the new favorite pet theory of Apple, is that the VFS layer sometimes caches a name provided by a userspace process instead of the correct name, and then uses the cached name to answer some API calls.

Based on the above, ‘sometimes’ is actually ‘pretty much all the time’.

Caching is definitely a trivially reproducible part of the problem, but there's also an unknown ingredient that sometimes causes the wrong name to persist across reboots, presumably by getting written to disk (but maybe it just gets broken on every boot).

@bdkjones

@andreyvit Huh? You reproduced it just by calling ls? I was doing that for weeks and did not get that on my end. Can you confirm that this breaks any folder on demand?

@andreyvit

@bdkjones I did! But maybe I just lucked out. I cannot do it again myself. :-( There's some other precondition required.

@ttilley
Collaborator

poked a kernel dev staffing a lab at WWDC. lol. https://twitter.com/jauricchio/status/608663324004188160

@andreyvit

@ttilley They actually had a dedicated file systems lab there (Tue 9:00). Not sure how to find someone from it, though.

@ttilley
Collaborator