Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add Solaris FEN support #263

Closed
wants to merge 9 commits into from
Closed

Conversation

@isaacdavis
Copy link

@isaacdavis isaacdavis commented Aug 28, 2018

This pull request fixes #12.

I'm an employee of Joyent, Inc., and have completed the Google corporate CLA.

What does this pull request do?

This pull request adds fsnotify support for Solaris and illumos via Solaris FEN. It is heavily based on #196, authored by @gfrey. As such, and with his approval, I have added @gfrey in the AUTHORS file in addition to myself.

One major revision to @gfrey's work concerns liveness when the watcher is closed when there is an unconsumed event in the channel - the old pull request would sometimes hang on the TestWatcherClose test. The new concurrency model is borrowed from fsnotify's inotify implementation.

Another major revision concerns proper translation of FEN events to fsnotify events, in particular the FILE_RENAME_TO FEN event.

Where should the reviewer start?

These pages are useful as an introduction to FEN:

The illumos man pages for port_create and related functions are also helpful for understanding how to interface with FEN.

The discussion on #196 is also worth reading. Some points I'll address here:

  • The use of syscall appears necessary here because golang.org/x/sys/unix does not (to my knowledge) provide a way to get atime and ctime.
  • cgo is used because, for illumos, the officially supported interface to the OS is libc rather than syscalls, and thus user-level code should refrain from making syscalls directly.

How should this be manually tested?

I ran all tests on a SmartOS host. On SmartOS, Go can be installed from Joyent's pkgsrc repository. fsnotify can then be built and tested as normal.

@isaacdavis
Copy link
Author

@isaacdavis isaacdavis commented Aug 30, 2018

The Travis CI build is failing only on MacOS on go 1.8. I'm not sure why this is, given that the code I changed affects only Solaris. I'm additionally unable to reproduce the failure on my own Mac. If anyone can offer insight, I would much appreciate it...

@gfrey gfrey mentioned this pull request Aug 30, 2018
@gfrey
Copy link

@gfrey gfrey commented Aug 30, 2018

I created issue #264. Seems to be a flake (it was locally and last PR builds seem to have worked).

@nathany
Copy link
Member

@nathany nathany commented Aug 30, 2018

#264 is now fixed on master. Thanks @gfrey. Please rebase.

Is there a CI solution for Solaris?

Is this sufficiently reviewed by people familiar with Solaris? I'll also do a review before it is merged.

@gdey
Copy link
Contributor

@gdey gdey commented Aug 31, 2018

kick integrations.

@gdey gdey closed this Aug 31, 2018
@gdey gdey reopened this Aug 31, 2018
@isaacdavis
Copy link
Author

@isaacdavis isaacdavis commented Aug 31, 2018

Looks like macOS builds are still failing even with the fix.

@nathany Joyent would be happy to offer a Triton container on the Joyent Public Cloud in which to build and test fsnotify. Under-the-hood this is a SmartOS zone that we'd administer and give you access to. I could assist with installing Go and setting up the build. You could then integrate this with the CI system of your choice. Please let me know if you're interested or have questions - feel free to email isaac.davis@joyent.com.

@nathany
Copy link
Member

@nathany nathany commented Aug 31, 2018

@isaacdavis A CI solution would be great, though I can't say I have any idea how to integrate a Triton container with a hosted solution like Travis CI.

Also, before adding Solaris support, I want to be sure there is a point person (or people would be better) for Solaris support that are handling any issues. /cc @gfrey

We also want to be sure that Solaris support is documented, especially if there are behaviour differences between operating systems that users need to know about.

One other note: at GopherCon we have been discussing standalone packages for each OS that fsnotify relies on. This would likely look like github.com/fsnotify/fsnotify/fen. @gdey is starting on a shared test helper package to reuse across packages.

@isaacdavis
Copy link
Author

@isaacdavis isaacdavis commented Aug 31, 2018

@nathany It's a little hacky, but I suppose in Travis you could write a script that ssh'ed in to the container and ran the tests there, sending back the results. This would also work with any other hosted CI solution, I think.

@isaacdavis
Copy link
Author

@isaacdavis isaacdavis commented Sep 5, 2018

I've discovered a bug in the current implementation under review:

The port_associate function accepts a pointer to an object describing the entity being watched. port_get, when returning from an event, returns the same pointer as the portev_object member of the port_event_t struct returned from port_get. Thus, if the event-handling code uses this pointer to retrieve information about the file affected (as we currently do), it must still point to valid memory.

In associateFile, we construct a file_obj and pass a pointer to it to port_associate. The Go runtime can relocate or garbage-collect this file_obj, rendering the pointer invalid. We thus may get garbage data back when we retrieve the object upon returning from port_get.

This bug does not manifest in the test suite, as there is not enough memory churn to trigger an overwrite of the file_obj memory. I discovered it when running Prometheus on SmartOS using fsnotify to watch config files.

I'll have a fix committed by tomorrow (2018-09-06) at noon PST.

@isaacdavis
Copy link
Author

@isaacdavis isaacdavis commented Sep 27, 2018

I'd like to check up on the status of this - what work needs to be done to get this merged? Please let me know what I can do to help.

Copy link
Member

@nathany nathany left a comment

Thanks for your work on this Isaac.

For this review, please keep in mind that I'm not familiar with FEN, nor have I exercised this code or even ensured that it compiles (I don't have a Solaris environment, nor am I personally planning to set one up).

Structurally, I would prefer if the cgo code were more isolated as we have done for fsevents, but I'm not sure if that can even be done to the same extent.

It appears that FEN requires that a malloced file_obj_t remain allocated for the lifetime of the watch, is that correct?

fen.go Outdated Show resolved Hide resolved
fen.go Outdated Show resolved Hide resolved
fen.go Outdated Show resolved Hide resolved
fen.go Outdated

// Handle all children of the directory.
for _, finfo := range files {
err := handler(filepath.Join(path, finfo.Name()), finfo)
Copy link
Member

@nathany nathany Sep 27, 2018

This list of of files includes subdirectories.

Is this the behaviour you want? To call associateFile/disassociateFile on a subdirectory?

(keeping it non-recursive for now is probably best, as the other platforms are not recursive yet)

Copy link
Author

@isaacdavis isaacdavis Oct 8, 2018

My understanding is that, if we add a watch to a directory, we want to watch all of its immediate children for changes - this would include children that are directories. This isn't recursive in that we don't descend into those subdirectories and watch the files/directories therein, walking down the entire tree - we just watch the subdirectory vnode itself and stop there.

Is this incorrect? Should children that are directories be explicitly omitted?

Copy link
Member

@nathany nathany Oct 8, 2018

That's correct.

Copy link
Author

@isaacdavis isaacdavis Oct 9, 2018

fixed.

fen.go Outdated Show resolved Hide resolved
fen.go Outdated Show resolved Hide resolved
@nathany
Copy link
Member

@nathany nathany commented Sep 27, 2018

@isaacdavis Apologies for the delay in reviewing your work. Here are some considerations before merging this.

I'm not personally willing to take on the responsibility of supporting Solaris and FEN myself, setting up CI, or any of that. What I'd like to see is yourself, but preferably multiple people, taking ownership of the FEN support.

The Solaris group would work with myself and hopefully some others on cross-platform API considerations (rename to, recursive watching, filtering, etc.). If there are bugs in the FEN support, I need to know there are people familiar with FEN that will triage issues and provide fixes. If/when we do a v2 API, I need to know there are people familiar with FEN that will help us move fsnotify forward on all platforms.

Honestly, I'd like to "recruit" small teams for each platform, but Solaris is a good place to start.

@nathany
Copy link
Member

@nathany nathany commented Sep 27, 2018

@isaacdavis In the near term I'd like to break fsnotify up into a few packages within the same repository. I've already begun some work in that direction with kqueue, by isolating the syscall wrappers.

This would allow someone to import "github.com/fsnotify/fsnotify/fen" and use it directly if they really want. My idea with the wrapper is to provide a very low-level platform-specific Go-ish API that provides documentation and testing (improving our means of testing fsnotify will be part of this effort). Also, it could provide all the things for the platform, whereas fsnotify is lowest common denominator.

Not saying this needs to happen before the PR is merged, but letting you know the direction I'm thinking of taking things.

The big question is how low-level is very low level. For example, the kqueue API takes file handles, not names of files to watch, so the kqueue wrapper may operate at that low level with very little in the way of amenities.

Part of my goal with this is to associate ownership to each subpackage. A small team managing each of fen, inotify, kqueue, fsevents -- possibly some people on multiple teams, and everyone participating in the common API design as well.

@isaacdavis
Copy link
Author

@isaacdavis isaacdavis commented Sep 27, 2018

@nathany Thanks for your thorough responses. I'm happy to take ownership of FEN support and get involved in this project!

Regarding CI: After talking with some others at Joyent, we think the easiest solution would be to set up a Jenkins instance on a VM in the Joyent Public Cloud, and run the build and tests there. This would integrate well with Github. I would be your point of contact for administering this. Does that sound reasonable to you?

One other note: I only have the resources to provide illumos support - the build and tests would run on illumos, not Oracle Solaris. illumos diverged from Oracle Solaris in 2010, and there is no guarantee that the two operating systems share a common FEN API.

If this all sounds reasonable, let me know and I'll go ahead with addressing the code review and setting up CI. Thanks!

@nathany
Copy link
Member

@nathany nathany commented Sep 27, 2018

@isaacdavis Thank you. That all sounds good to me. Use whatever CI works for you [though I will mention Drone.io may be worth looking into, as an open source alternative to Jenkins written in Go]

I think we'll document in the README who the teams are for each platform with respective GitHub handles, and perhaps figure out how to do some @ lists [I can look into these things].

Hopefully we will see some other people commit to support FEN, and maybe even Solaris specifically, but we can move ahead either way if it's okay with you. /cc @gfrey @4ad @amitkris

@isaacdavis
Copy link
Author

@isaacdavis isaacdavis commented Sep 27, 2018

@nathany Sounds good. I'll get started setting everything up.

fsnotify.go Outdated
@@ -62,5 +62,5 @@ func (e Event) String() string {
return fmt.Sprintf("%q: %s", e.Name, e.Op.String())
}

// Common errors that can be reported by a watcher
// ErrEventOverflow: Common errors that can be reported by a watcher

comment on exported var ErrEventOverflow should be of the form "ErrEventOverflow ..."

Copy link
Member

@nathany nathany Mar 8, 2019

note: if you rebase on master, this was resolved

@isaacdavis
Copy link
Author

@isaacdavis isaacdavis commented Jan 7, 2019

Hi @nathany - happy 2019! Just checking up on this. Joyent is still eager to get FEN support merged. I've updated my above comment about Jenkins slightly.

@nathany
Copy link
Member

@nathany nathany commented Mar 8, 2019

@isaacdavis An "fsnotify automation" account sounds good. I've also sent you an invite to grant write access and can bump your permissions further once accepted.

@isaacdavis
Copy link
Author

@isaacdavis isaacdavis commented Mar 8, 2019

@nathany great - I've accepted the invite.

@isaacdavis
Copy link
Author

@isaacdavis isaacdavis commented Mar 11, 2019

@nathany ok, I've created an automation account. This account will need to be given write access to the repo -- once that's done, I can set up the Jenkins webhook and build job.

@jubalskaggs
Copy link

@jubalskaggs jubalskaggs commented May 31, 2019

@isaacdavis thank you for your work on this - I'm just chiming in to say that we're testing this pr on solaris 11 (also illumos) and it's working just fine. I'd love to see solaris-fen merged into mainline fsnotify :)

@nathany
Copy link
Member

@nathany nathany commented Sep 14, 2019

Apologies for the delay. It looks like this pull request has fallen out of date a bit and still needs a review by someone familiar with FEN.

@Smithx10
Copy link

@Smithx10 Smithx10 commented Jan 12, 2020

@isaacdavis @nathany I'm not sure if go 1.13 came out yet in May of 2019, but with it came the "illumos" build tag. Are we waiting on someone from Solaris? If so perhaps @isaacdavis would only have to maintain the Illumos tag since they are different at this point.

@Smithx10
Copy link

@Smithx10 Smithx10 commented Jan 12, 2020

I tried Cross Compiling from Linux to Illumos (go 1.13) https://github.com/papertrail/go-tail with this PR and received the following error:

../../../go-tail/follower/follower.go:55:12: undefined: fsnotify.Watcher

On Illumos it builds fine. Is this to be expected?

I'm assuming I am missing something to make the cross compiling work.

@isaacdavis
Copy link
Author

@isaacdavis isaacdavis commented Jan 14, 2020

@Smithx10 @nathany It's new information that this doesn't build on Solaris, though I'm not surprised given the divergence between the two operating systems. Now that the illumos build tag exists, I think it's a good idea for fsnotify to begin differentiating between illumos and Solaris to avoid these types of issues.

@nathany -- do you have an interest in introducing the illumos build tag to fsnotify? I'm happy to create a PR for that and maintain the illumos tag. I only have the resources to build and test on illumos, so we'd have to find someone else to maintain the Solaris tag going forward.

The remaining work of getting another reviewer and granting access to the Jenkins server still stands. I realize I've neglected this, but I'm happy to get back to it. @nathany just confirm that you're willing for fsnotify to begin using the illumos build tag before I go ahead with a PR.

Thank you both!

@Smithx10
Copy link

@Smithx10 Smithx10 commented Jan 22, 2020

@nathany @isaacdavis any update on this, or perhaps some flags for the cgo cross compile?

@Smithx10
Copy link

@Smithx10 Smithx10 commented Jan 24, 2020

@nathany Anything I can do to help move this along?

@Smithx10
Copy link

@Smithx10 Smithx10 commented Jan 28, 2020

./inserts jeopardy music here.

@isaacdavis
Copy link
Author

@isaacdavis isaacdavis commented Feb 1, 2020

@Smithx10 I've played around with this a little bit -- my hunch is that cross-compilation of programs that use cgo is difficult, because you'd also need to set up the C toolchain to cross-compile. This would involve obtaining a linux-to-illumos C cross-compiler and setting it up to link with the illumos libraries, etc.

I'm not sure if this has ever been done -- all I could find via google was this: https://gist.github.com/AlainODea/6679613

@Smithx10
Copy link

@Smithx10 Smithx10 commented Feb 1, 2020

For cross compilation support I think we can rewrite this just using go. I believe that’s what the other distros did.

@Toasterson
Copy link

@Toasterson Toasterson commented Mar 26, 2021

Hello everyone,

I see that this PR is stuck. Is there anything we from the illumos community can do to advance this @nathany ?

@nshalman
Copy link
Contributor

@nshalman nshalman commented Jun 2, 2021

I'm working on getting these Event Ports primitives into x/sys/unix and then cleaning up this code to use those instead of cgo.
So far I have some basic bits ported (though not yet upstreamed) sitting in this branch:
nshalman/sys@master...nshalman:fsnotify-fen

I also have the beginnings of the port of this code to use those changes here:
nshalman#1

Unfortunately that code currently passes most but not all of the integration tests, so I'm currently a bit confused and temporarily blocked. Eventually when I get it all working and get the x/sys/unix changes through the review and upstreaming process, I'll inevitably further refactor my fsnotify code and then open a PR to supersede this one.

My goal is to remove all use of cgo and unsafe in the fsnotify code.

nshalman added a commit to nshalman/sys that referenced this issue Jun 3, 2021
This work is in support of a cleanup of fsnotify/fsnotify#263

Change-Id: I2bc50036087a038c2aa8b6178687d1a870b9d1fd
nshalman added a commit to nshalman/sys that referenced this issue Jun 3, 2021
This work is in support of a cleanup of fsnotify/fsnotify#263

Change-Id: Ibd7500d20322765bfd50aa18333eb43ee7b659d7
nshalman added a commit to nshalman/sys that referenced this issue Jun 8, 2021
This work is in support of a cleanup of fsnotify/fsnotify#263

Change-Id: Ibd7500d20322765bfd50aa18333eb43ee7b659d7
nshalman added a commit to nshalman/sys that referenced this issue Jun 13, 2021
This work is in support of a cleanup of fsnotify/fsnotify#263

Change-Id: I2bc50036087a038c2aa8b6178687d1a870b9d1fd
nshalman added a commit to nshalman/sys that referenced this issue Jun 13, 2021
This work is in support of a cleanup of fsnotify/fsnotify#263

Change-Id: Ibd7500d20322765bfd50aa18333eb43ee7b659d7
nshalman added a commit to nshalman/sys that referenced this issue Jun 13, 2021
This work is in support of a cleanup of fsnotify/fsnotify#263

Change-Id: Ibd7500d20322765bfd50aa18333eb43ee7b659d7
nshalman added a commit to nshalman/sys that referenced this issue Jun 17, 2021
This work is in support of a cleanup of fsnotify/fsnotify#263

Change-Id: Ibd7500d20322765bfd50aa18333eb43ee7b659d7
@nshalman
Copy link
Contributor

@nshalman nshalman commented Jul 1, 2021

For anyone checking in here, https://go-review.googlesource.com/c/sys/+/324630 is close to complete, and an updated PR to supersede this one will follow after that finally lands.

gopherbot pushed a commit to golang/sys that referenced this issue Aug 17, 2021
This work is in support of a cleanup of fsnotify/fsnotify#263

Change-Id: Ibd7500d20322765bfd50aa18333eb43ee7b659d7
Reviewed-on: https://go-review.googlesource.com/c/sys/+/324630
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
gopherbot pushed a commit to golang/sys that referenced this issue Aug 17, 2021
This work is in support of a cleanup of fsnotify/fsnotify#263

Change-Id: I2bc50036087a038c2aa8b6178687d1a870b9d1fd
Reviewed-on: https://go-review.googlesource.com/c/sys/+/324629
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
@nathany
Copy link
Member

@nathany nathany commented Aug 18, 2021

Closing in favour of #371. Thanks for working on this.

@nathany nathany closed this Aug 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

9 participants