New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move File and IO::FileDescriptor's platform-specific parts to Crystal::System #5333

Merged
merged 3 commits into from Dec 13, 2017

Conversation

Projects
None yet
4 participants
@RX14
Member

RX14 commented Nov 28, 2017

Hopefully the third time lucky we'll manage to merge an IO portability refactor.

I've copied the commit messages out below so people actually read them. I also suggest you review each commit in order, instead of reviewing the diff as a whole. it makes the entire thing a lot more readable and understandable.

Make IO::FileDescriptor fcntl private

fcntl() is an extremely low-level platform-specific detail which should only be used internally to IO::FileDescriptor. In this commit, it is removed from the public API and made private.

The only usage of this API was added only recently, in Crystal::Main. It is replaced with a method which calls LibC.fcntl directly, since it is too early in init to create a new IO::FileDescriptor instance directly and ask it if it's blocking.

Separate platform-specific parts of IO::FileDescriptor

In order to add windows support to Crystal, IO::FileDescriptor cannot depend on posix APIs. Here, we refactor out the platform-specific parts of IO::FileDescriptor into a mixin module in Crystal::System. Some platform-specific methods are refactored out of methods which contain both platform-specific and cross-platform behaviour. These methods are prefixed with system_ and are private.

Make Tempfile inherit File

Making Tempfile inherit from File allows the instance to be passed to methods which expect File, and makes Tempfile much more powerful. This requires adding a new private constructor to File which sets @path and calls super.

Separate platform-specific parts of File

Similar to previous patch: Separate platform-specific parts of IO::FileDescriptor.

@RX14 RX14 force-pushed the RX14:feature/io-crystal-system-2 branch from b0e0631 to fd81b4d Nov 29, 2017

@straight-shoota

This comment has been minimized.

Contributor

straight-shoota commented Nov 29, 2017

In the second commit, the extracted platform-specific methods are prefixed platform_. However, in the commit message it says to be system_.

I'd prefer system_ because it matches the namespace and is more descriptive.

@RX14

This comment has been minimized.

Member

RX14 commented Nov 29, 2017

I'll wait a few days to see if there is consensus on naming. Then I'll pick one and make the commit message and diff match up.

If the only problem you could find is naming, I'm happy!

raise "Invalid access mode #{mode}"
end
else
raise "Invalid access mode #{mode}"

This comment has been minimized.

@Papierkorb

Papierkorb Nov 29, 2017

Contributor

Doesn't support a mode like rb+. Also, #{mode.inspect} would make it clearer in the output what comes from the user and what's the error message itself

This comment has been minimized.

@RX14

RX14 Nov 29, 2017

Member

This is directly copied from the old file.cr if it has issues, it's out of the scope of this PR.

This comment has been minimized.

@Papierkorb

Papierkorb Nov 29, 2017

Contributor

Fair enough

path = "#{tmpdir}#{name}.XXXXXX#{extension}"
if extension
fd = LibC.mkstemps(path, extension.bytesize)

This comment has been minimized.

@Papierkorb

Papierkorb Nov 29, 2017

Contributor

mkstemps(3) reads: Since it [the template] will be modified, template must not be a string constant, but should be declared as a character array. That's not obvious from the function itself (At least to a non-C user), there should be atleast a comment noting that the string will be modified (Which isn't "possible" from Crystal), or it could be copied into a temporary Bytes.

This comment has been minimized.

@RX14

RX14 Nov 29, 2017

Member

ditto

def stat(path)
if LibC.stat(path.check_no_null_byte, out stat) != 0
raise Errno.new("Unable to get stat for '#{path}'")

This comment has been minimized.

@Papierkorb

Papierkorb Nov 29, 2017

Contributor

path.inspect would give proper visual "escaping"

This comment has been minimized.

@RX14

RX14 Nov 29, 2017

Member

ditto

def real_path(path)
real_path_ptr = LibC.realpath(path, nil)
raise Errno.new("Error resolving real path of #{path}") unless real_path_ptr

This comment has been minimized.

@Papierkorb

Papierkorb Nov 29, 2017

Contributor

Same path.inspect, this and following error messages omit the single-quote. (Consistency)

This comment has been minimized.

@RX14

RX14 Nov 29, 2017

Member

ditto

@Papierkorb

This comment has been minimized.

Contributor

Papierkorb commented Nov 29, 2017

I think there needs to be a proper support for platform-specific functionality in the stdlib. IMHO, throwing stuff out like #fcntl (The easy access to it) can be harmful: The wrapper methods are usually much easier and nicer to use; The LibC pendants are harder to get right.

Truth is, programs tend to need platform-specific stuff that's not yet nicely wrapped. I doubt that you'll be able to offer everything that the control functions support in an agnostic manner. And many programs are fine if they don't work on some random platform, as long they work where they're intended to be used. Diminishing all API to the most common denominator will only hurt on the long run.

Outside the scope of this PR, I'd actually like to see a documentation flag to note platform information right in the in-source docs. The docs generator could then mark such a section visually, or add a "Tag badge" to it.

@RX14

This comment has been minimized.

Member

RX14 commented Nov 29, 2017

re: removing fcntl, I actually think you're right. I'll probably revert that commit.

@RX14

This comment has been minimized.

Member

RX14 commented Nov 29, 2017

I appreciate these comments, but comments (in support of or against) about the approach of using a mixin module from Crystal::System would be most helpful to tell whether this PR is going to get merged at all.

@RX14 RX14 referenced this pull request Nov 29, 2017

Closed

[WIP] Windows #3582

6 of 24 tasks complete
@ysbaddaden

I'm not sure mixins are better than composition, but that should really help going forward, so I'm gonna say 👍 ! I guess we'll probably rework it later when there is windows support, but... later. Let's go forward —as long as the public API is good and stable.

I don't think we need the Mixin suffix when naming modules, I mean include Crystal::System::FileDescriptor is just fine, and the filenames don't have the _mixin suffix anyway (crystal/system/file_descriptor.cr).

About platform-specific methods, there are usages of the _impl suffix in different places in the stdlib. Maybe we could stick to that? My point is that "private impl" can englob the "target-specific" meaning.

src/file.cr Outdated
class File < IO::FileDescriptor
private alias Sys = Crystal::System::File

This comment has been minimized.

@ysbaddaden

ysbaddaden Nov 30, 2017

Member

I'd avoid the alias and use the long naming.

# TODO: use fcntl/lockf instead of flock (which doesn't lock over NFS)
# TODO: always use non-blocking locks, yield fiber until resource becomes available
def flock_shared(blocking = true)

This comment has been minimized.

@ysbaddaden

ysbaddaden Nov 30, 2017

Member

Would it be opportune to break the file lock API now? The current flock naming and API is tied to the underlying flock POSIX implementation, which is bad. What about abstracting it now, since this patch is meant to support windows, eventually.

enum Lock
  Shared
  Exclusive
end

def lock(operation : Lock, blocking = false, &block)
end

def lock(operation : Lock, blocking = false)
end

def unlock
end

This comment has been minimized.

@RX14

RX14 Nov 30, 2017

Member

I'd tentatively say that's easily done in a latter PR - and so it should be.

This comment has been minimized.

@ysbaddaden

ysbaddaden Nov 30, 2017

Member

So be it. Let's stick to extracting posix-specific calls in this PR.

This comment has been minimized.

@RX14

RX14 Nov 30, 2017

Member

I'd recommend making an issue to track that though, we seem to forget a lot of these refactors once the PR gets merged.

@RX14

This comment has been minimized.

Member

RX14 commented Nov 30, 2017

@ysbaddaden if you look at File, there's a Crystal::System::File and a Crystal::System::FileMixin. The Mixin naming comes from there.

@ysbaddaden

This comment has been minimized.

Member

ysbaddaden commented Nov 30, 2017

Hum, I prefer to use def self. instead of extend self when there is no actual benefit to using the module methods as both includeable and class methods (for example Random::Secure does benefit). This would avoid having System::File and System::FileMixin and but just System::File, with some class methods, and some instance methods to be included in ::File.

Fairly minor, but it looks better to my eyes.

@RX14

This comment has been minimized.

Member

RX14 commented Nov 30, 2017

@ysbaddaden sounds good to me. I'll change it.

@RX14

This comment has been minimized.

Member

RX14 commented Dec 1, 2017

I'm still undecided about naming. I think a system_ prefix is better than _impl, because there's a difference between just a class-private implementation detail and a platform-specific method. system corresponds to the namespace, so it'd a good name.

@ysbaddaden

This comment has been minimized.

Member

ysbaddaden commented Dec 1, 2017

To me it boils down to "implementation detail", but that's pedantic nitpicking, no need to spend hours on such details.

@RX14 RX14 force-pushed the RX14:feature/io-crystal-system-2 branch from fd81b4d to 70bb1e0 Dec 1, 2017

@RX14

This comment has been minimized.

Member

RX14 commented Dec 1, 2017

Updated this PR. Changed the naming to system_, and removed the Mixin naming. Also re-added fcntl and moved the IO::System-specific parts to the mixin (windows doesn't use it yet because there's no evented IO).

@RX14 RX14 referenced this pull request Dec 1, 2017

Merged

"hello world" windows port #5339

@RX14

This comment has been minimized.

Member

RX14 commented Dec 1, 2017

Also, if this gets merged, please make sure to rebase, not just squash. I think it'll make for a better commit history.

private def unbuffered_rewind
seek(0, IO::Seek::Set)
self
self.pos = 0

This comment has been minimized.

@ysbaddaden

ysbaddaden Dec 4, 2017

Member

Why the change?

This comment has been minimized.

@straight-shoota

straight-shoota Dec 4, 2017

Contributor

pos=(value) calls seek(value) and Seek::Set is default. So seek(0) would be the most direct call here.

This comment has been minimized.

@RX14

RX14 Dec 4, 2017

Member

No reason for the change, it's trivial to see it's the same but I can revert it if you want.

end
end
private def system_pos

This comment has been minimized.

@ysbaddaden

ysbaddaden Dec 4, 2017

Member

I'd prefer system_tell over system_pos.

I wish we'd drop IO#pos and IO#pos= and kept IO#seek and IO#tell instead (less duplication, more explicit naming).

This comment has been minimized.

@RX14

RX14 Dec 4, 2017

Member

We just need 1 call: seek which always returns the current position. I suggested that IO implement pos pos= and tell in the base class to @asterite and simply provide a raising IO#tell by default.

For now, this is how IO works. It's not great, but it is how it is until I refactor.

This comment has been minimized.

@RX14

RX14 Dec 4, 2017

Member

Personally i'd rather keep pos, pos= and seek. tell is obscure and weird naming to me.

This comment has been minimized.

@ysbaddaden

ysbaddaden Dec 4, 2017

Member

Well, it's "seek and tell". They're well known associated terms, whatever the language. E.g. C, Haskell, Python, Ruby or Rust, all have both.

I don't like the pos naming, it's an abbreviation and inexplicit (inherited from Ruby). At least position would be a more explicit name but I still prefer tell (broadly used). Using a setter (pos=) to perform an action, that is move the cursor in a file, sounds awful to me, in addition to be an exact duplicate for seek, too.

This comment has been minimized.

@RX14

RX14 Dec 4, 2017

Member

I have to say "seek" is not an unknown term to me, although "tell" is. I agree with you on pos vs position, but not that setters shouldn't have performance side-effects.

This comment has been minimized.

@straight-shoota

straight-shoota Dec 4, 2017

Contributor

As far as I can tell, the terms tell and seek come from working with files. IO is a general class, and to me it doesm't seem logical to transfer this meaning from files to any other stream. But I may be mistaken.
Another thought: pos and pos= is a concept easy to understand. One reads the current stream position, the other sets it (position and position= are a bit longer but also fine). With seek and tell it is more complicated, if your not familiar with these terms. You have to remember two names instead of one and can't simply asseses their meaning.

This comment has been minimized.

@Papierkorb

Papierkorb Dec 5, 2017

Contributor

I'm also for dropping #seek/#tell and using #position/#position= (Not #pos). The latter is really clear in intention. Also, it lets you seek around in a file using normal code:

my_file.position += 5 # This is how you'd write it anywhere else too! Fantastic!
my_file.seek(my_file.tell + 5) # Compared to the one above: Meh.
my_file.seek(5, File::SEEK_CUR) # I can never remember which SEEK_* I need - this sucks.

This comment has been minimized.

@RX14

RX14 Dec 5, 2017

Member

I think seek should stay, io.position = io.size - 5 involves 2 syscalls whereas io.seek(-5, IO::Seek::End) requires only one. Similar for io.position += 5. That being said, I think that using io.position is a much nicer interface which is more than adequate for most people. Perhaps some caching for size and position would mean that we can obviate seek entirely.

Regardless of that, we should make a new issue containing all the IO refactor suggestions from this thread. Unless anyone else volunteers to make that issue now, i'll get it done once this PR is merged.

r
end
def fcntl(cmd, arg = 0)

This comment has been minimized.

@ysbaddaden

ysbaddaden Dec 4, 2017

Member

IO#fcntl method is supposed to be public, but it won't be documented?

This comment has been minimized.

@RX14

RX14 Dec 4, 2017

Member

It wasn't documented before, I can add documentation if you want.

This comment has been minimized.

@ysbaddaden

ysbaddaden Dec 4, 2017

Member

No, I meant: will the #fcntl method still appear in the documentation?

This comment has been minimized.

@RX14

RX14 Dec 4, 2017

Member

Hmm, not sure. Let me check.

This comment has been minimized.

@RX14

RX14 Dec 5, 2017

Member

They actually break the doc generator by saying they're included from the module, but the module is :nodoc: so that the links don't work.

This comment has been minimized.

@RX14

RX14 Dec 5, 2017

Member

I'll go and do this method the same as the rest - with the system_ prefix and {% ifdef %} the method out.

This comment has been minimized.

@straight-shoota

straight-shoota Dec 5, 2017

Contributor

Yes, I've noticed this, too. Inclusion and extension of modules that are not documented should probably be hidden in the docs generator.

This comment has been minimized.

@RX14

RX14 Dec 5, 2017

Member

I pushed a fixup commit which resolves this. If this is OK i'll rebase to allow this to be merged.

@RX14

This comment has been minimized.

Member

RX14 commented Dec 4, 2017

There have been loads of great ideas about refactoring IO::FileDescriptor and IO here, but I think that i'll work on a later PR which basically contains all this. I want to really focus on windows this christmas, instead of the IO refactors that work uncovers.

If people feel that some of that work would be better in this PR however, i'm happy to include it.

@RX14

This comment has been minimized.

Member

RX14 commented Dec 4, 2017

Plus, a larger scope pf the PR means I have the chance to perform these same refactors on Socket and such. I really feel that these should be seperate PRs for more reasons than wanting to get this merged.

@RX14 RX14 requested a review from ysbaddaden Dec 6, 2017

@RX14

This comment has been minimized.

Member

RX14 commented Dec 12, 2017

Ping? I think I've solved all the reviews, and all this should need is a tiny review of the fixup commit from @ysbaddaden or a review from another core team member. I think this PR is fine to land in 0.24.1.

RX14 added some commits Nov 28, 2017

Separate platform-specific parts of IO::FileDescriptor
In order to add windows support to Crystal, IO::FileDescriptor cannot depend on
posix APIs. Here, we refactor out the platform-specific parts of
IO::FileDescriptor into a mixin module in Crystal::System. Some
platform-specific methods are refactored out of methods which contain both
platform-specific and cross-platform behaviour. These methods are prefixed with
`system_` and are private.
Make Tempfile inherit File
Making Tempfile inherit from File allows the instance to be passed to methods
which expect File, and makes Tempfile much more powerful. This requires adding a
new private constructor to File which sets @path and calls super.
Separate platform-specific parts of File
Similar to previous patch: Separate platform-specific parts of
IO::FileDescriptor.

@RX14 RX14 force-pushed the RX14:feature/io-crystal-system-2 branch from ccdd6b7 to 586846d Dec 13, 2017

@RX14

This comment has been minimized.

Member

RX14 commented Dec 13, 2017

Build failure is known related to travis's new build images: travis-ci/travis-ci#8891.

@RX14 RX14 merged commit 5fd965e into crystal-lang:master Dec 13, 2017

1 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
ci/circleci Your tests passed on CircleCI!
Details

@RX14 RX14 added this to the Next milestone Dec 24, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment