Conversation

zielmicha
Contributor

(rationale available as RFC #5753)

This PR adds a views module together with a View[T] type. It is defined as a triple:

type View[T] = object
  data: ptr T
  length: int
  when needGcKeep: # not needed for gc:none and gc:boehm
    gcKeep: RootRef
  • initView creates a view pointing into existing memory; newView creates a view by copying a seq or string
  • copyAsSeq and copyAsString convert the view to a seq or string
  • slice takes a subview of a view
  • copyFrom and copyTo copy data between views
  • [], []= and the items iterator are available
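
A minimal sketch of the shape described above, for illustration only. `SimpleView`, `newSimpleView` and `checkIndex` are hypothetical names, not the PR's code; the keep-alive field is typed as `ref seq[T]` instead of `RootRef` to avoid casts, and `IndexDefect` assumes Nim 1.4 or later.

```nim
# Hypothetical sketch, not the code from this PR.
type
  SimpleView[T] = object
    data: ptr UncheckedArray[T]  # first element of the viewed range
    length: int                  # number of elements in the range
    gcKeep: ref seq[T]           # keeps the backing copy alive (assumption: the PR uses RootRef)

proc newSimpleView[T](s: seq[T]): SimpleView[T] =
  ## Copies `s` and returns a view pointing into the copy.
  var copied: ref seq[T]
  new(copied)
  copied[] = s
  result.length = copied[].len
  result.gcKeep = copied
  if result.length > 0:
    result.data = cast[ptr UncheckedArray[T]](addr copied[][0])

proc len[T](v: SimpleView[T]): int = v.length

proc checkIndex[T](v: SimpleView[T], i: int) =
  ## Raises on out-of-bounds access instead of touching foreign memory.
  if i < 0 or i >= v.length:
    raise newException(IndexDefect, "view index out of bounds")

proc `[]`[T](v: SimpleView[T], i: int): T =
  checkIndex(v, i)
  v.data[i]

proc `[]=`[T](v: SimpleView[T], i: int, val: T) =
  checkIndex(v, i)
  v.data[i] = val

iterator items[T](v: SimpleView[T]): T =
  for i in 0 ..< v.length:
    yield v.data[i]

proc copyAsSeq[T](v: SimpleView[T]): seq[T] =
  for x in v:
    result.add x

let original = @[1, 2, 3]
let view = newSimpleView(original)
view[1] = 20             # writes go to the copy, not to `original`
echo view.copyAsSeq      # contents of the copy
echo original            # left untouched by the write above
```

Because the view owns a reference to its private copy, it stays valid even if `original` is later resized or goes out of scope.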


proc newView*(s: string): ByteView =
  ## Copies a string and returns a new view pointing into the copy.
  let copied = new(string)
Member

@yglukhov yglukhov Jun 7, 2017

Why copy the string? It could be shallowCopied instead. And there is no need for the ref string variant, IMO.

Contributor Author

@zielmicha zielmicha Jun 7, 2017

I'm not sure. shallowCopy is counter-intuitive here. From system docs:

Be careful with the changed semantics though! There is a reason why the default assignment does a deep copy of sequences and strings.

Also, note that the ref string variant is not exported, because it's not safe when the string is resized.

@yglukhov
Member

yglukhov commented Jun 7, 2017

This implementation looks completely unfriendly to JS

@zielmicha
Contributor Author

Hmm, I guess JS needs a separate implementation. At least fast C interop doesn't apply there.

@zielmicha
Contributor Author

This implementation looks completely unfriendly to JS

@yglukhov Is the JS backend mature enough to consider implementing views for it? The last time I checked (~two years ago) it wasn't possible to write a nontrivial program using it, and now I failed too (#5974).

@krux02
Contributor

krux02 commented Jul 24, 2017

Have you thought about supporting the slicing notation? myseq.initView(0..^1)

@zielmicha
Contributor Author

Good idea. Maybe I should just add a [] operator for slices (so initView(myseq)[0..^1]).
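
A rough sketch of what such a `[]` operator for slices could look like, under stated assumptions: `RawView`, `unsafeViewOf` and `slice` here are hypothetical stand-ins rather than the PR's definitions, the view does not keep its storage alive, and `IndexDefect` assumes Nim 1.4 or later.

```nim
# Hypothetical sketch: RawView and unsafeViewOf are illustrative names only.
type
  RawView[T] = object
    data: ptr UncheckedArray[T]
    length: int

proc unsafeViewOf[T](s: var seq[T]): RawView[T] =
  ## Views `s` without copying. Unsafe: `s` must outlive the view and
  ## must not be resized while the view is in use.
  result.length = s.len
  if s.len > 0:
    result.data = cast[ptr UncheckedArray[T]](addr s[0])

proc slice[T](v: RawView[T], first, last: int): RawView[T] =
  ## Subview of the inclusive index range first..last, bounds-checked.
  if first < 0 or last >= v.length or first > last + 1:
    raise newException(IndexDefect, "slice out of bounds")
  result.length = last - first + 1
  if result.length > 0:
    result.data = cast[ptr UncheckedArray[T]](addr v.data[first])

proc `[]`[T](v: RawView[T], x: HSlice[int, int]): RawView[T] =
  slice(v, x.a, x.b)

proc `[]`[T](v: RawView[T], x: HSlice[int, BackwardsIndex]): RawView[T] =
  # `^n` counts from the end, so it resolves to length - n.
  slice(v, x.a, v.length - int(x.b))

var storage = @[10, 20, 30, 40]
let whole = unsafeViewOf(storage)
let middle = whole[1..^2]    # elements at indices 1 and 2
let firstTwo = whole[0..1]   # elements at indices 0 and 1
```

Two plain overloads are used instead of a generic index constraint to keep the sketch short; both only narrow the pointer and length, so no elements are copied.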

@zielmicha
Contributor Author

@dom96 Do you think this can be merged? Any specific comments?

@yglukhov
Member

@zielmicha, actually I've got a philosophical comment regarding the gcKeep field. I think in the very base impl of View there should be no gcKeep, to reduce the overhead even further. I don't want to stall this PR by any means, because it's already better than nothing, but can we do anything about it while still keeping the API safe by default, with unsafe as an option?

data: ptr T
length: int
when needGcKeep:
  gcKeep: RootRef
Contributor

I think that you need both types: one that keeps the underlying data structure alive and one that doesn't. I expect a View object to be passed down the call stack, but not to be written to some variable, so it would not need to keep the original object alive. The need for a gcKeep would be an exceptional case. But your implementation only allows it to be enabled or disabled globally.

Contributor Author

needGcKeep is defined as not (compileOption("gc", "boehm") or compileOption("gc", "none")), so it can't be disabled even globally.

Contributor

Yes, that is what I mean. It can be disabled/enabled globally, but it should be possible to enable/disable it locally, meaning one type that has the RootRef and a different type that does not. This is similar to ptr and ref being different types: one keeps the object alive, the other doesn't.
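
The two-type idea could be sketched roughly as follows. `UnsafeView`, `KeptView` and `unsafeView` are hypothetical names for illustration, not part of the PR.

```nim
# Hypothetical sketch: neither type is from the PR.
type
  UnsafeView[T] = object   # like `ptr`: does not keep anything alive
    data: ptr UncheckedArray[T]
    length: int
  KeptView[T] = object     # like `ref`: also holds a keep-alive reference
    raw: UnsafeView[T]
    gcKeep: RootRef

proc unsafeView[T](v: KeptView[T]): UnsafeView[T] =
  ## Drops the keep-alive reference; the caller must ensure the data
  ## outlives the returned view (for example, for the duration of a call).
  v.raw

# Size cost of the extra field: one pointer per view.
echo sizeof(UnsafeView[byte]), " vs ", sizeof(KeptView[byte])
```

On a typical 64-bit target the pointer and length take 16 bytes and the keep-alive reference adds 8 more, so the choice between the two types is mostly about what is copied when a view is passed by value.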

  # View[char] and View[byte] are mostly the same thing
  return initView(cast[ptr byte](s.data), s.len, when needGcKeep: s.gcKeep else: nil)

proc initView(s: ref string): ByteView =
Contributor

Both initView overloads, the one that accepts a string as well as the one that accepts a seq, only work when the seq/string is a ref. To my knowledge it is not safe to cast a ptr to a ref, therefore it is not possible to "view" into a seq that is not a ref type.

  return v.length

proc ptrAdd[T](p: pointer, i: int): ptr T =
  return cast[ptr T](cast[int](p) +% (i * sizeof(T)))
Contributor

Why don't you use a pointer to an unchecked array? That is a pointer with pointer arithmetic.
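
A small side-by-side sketch of the two approaches. The `ptrAdd` below is a variant that takes `ptr T` so that `T` can be inferred (an assumption made for the example; the version under review takes an untyped `pointer`).

```nim
# Variant of ptrAdd taking `ptr T` (hypothetical; the PR's takes `pointer`).
proc ptrAdd[T](p: ptr T, i: int): ptr T =
  cast[ptr T](cast[int](p) +% (i * sizeof(T)))

var buf = [1, 2, 3, 4]
let base = addr buf[0]

# Manual pointer arithmetic: compute the address, then dereference.
let viaAdd = ptrAdd(base, 2)[]

# Same element through a pointer to an unchecked array: plain indexing,
# with no bounds checks performed by the compiler.
let arr = cast[ptr UncheckedArray[int]](base)
let viaArray = arr[2]

doAssert viaAdd == viaArray
```

Both forms read the same memory; the unchecked-array form moves the offset computation into ordinary index syntax.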

@zielmicha
Contributor Author

zielmicha commented Sep 26, 2017

@yglukhov The only cost is that passing View as an argument copies 24 bytes, not 16. I'm not sure if introducing two types is worth this small saving.

At least in my use cases, I haven't observed any performance difference at all between a 24-byte and a 16-byte View (reactor.nim uses View from collections.nim).

@dom96
Contributor

dom96 commented Oct 23, 2017

I would prefer the name Slice but I guess that is already taken.

Otherwise I am going to delegate this to @Araq, I'm mostly happy to include this though.


ByteView* = View[byte]

proc initView*[T; R: ref](data: ptr T, len: int, gcKeep: R): View[T] =
Contributor

I think that any function which accepts a 'ptr' should have an unsafe prefix.

@@ -0,0 +1,148 @@
## View is a type representing a range of elements in an array. In can be thought as a pointer plus a size.
Contributor

In -> It

## View is a type representing a range of elements in an array. In can be thought as a pointer plus a size.
## View can be created to an arbitrary memory segment and can additionally keep a single ``ref`` object alive.
##
## This module defines views and several helper operations on them. All functions in this module (except for unsafe version of ``initView``) are designed to be memory safe (e.g. they raise exception on out-of-bounds accesses).
Contributor

Examples in the docs would be nice.

  else:
    initView(addr s[0], s[].len, gcKeep=s)

proc newView*[T](s: seq[T]): View[T] =
Contributor

This doesn't follow Nim's convention.

  ## Creates a view of length zero.
  return initView[T](nil, 0)

proc initView[T](s: ref seq[T]): View[T] =
Contributor

Shouldn't this be var seq[T]?

Contributor Author

That wouldn't be memory safe. (You can use initView(ptr T, int) for that)

Contributor

Couldn't you use GC_ref to make it safe?

Contributor Author

Yes. But I think it would then need to allocate an object so that GC_unref can be called in a destructor.

  result = initView(s)

proc isNil*(v: View): bool =
  return v.len == 0
Contributor

This is rather misleading.

Contributor

Also, how can you use 'len' here when it's defined below this function?

Contributor Author

View is generic and len has multiple overloads, so it's mixin by default.

Contributor

@krux02 krux02 Oct 24, 2017

I would say a View is never nil, just empty, and therefore this isNil should be removed.

@Araq
Member

Araq commented Jan 11, 2018

Stylistic issues aside, it's mostly fine but I think it would be ill advised to introduce this. Once openArray and var openArray work with some minor Rust inspired borrow checking to make it safe (RFC soon to be written) there would be considerable overlap with this stdlib module.
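
For comparison, a sketch of the openArray-based style as it can be written in later Nim versions, assuming the `toOpenArray` proc from the `system` module is available. The borrow checking mentioned above is not shown, and `sum` is a hypothetical helper defined just for this example.

```nim
proc sum(xs: openArray[int]): int =
  ## Works on any contiguous run of ints without copying.
  for x in xs:
    result += x

let data = @[1, 2, 3, 4, 5]
echo sum(data)                    # the whole seq
echo sum(data.toOpenArray(1, 3))  # elements at indices 1..3, no copy made
```

Here the sub-range exists only for the duration of the call, which covers the "passed down the call stack" use case without a separate view type.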

@zielmicha
Contributor Author

zielmicha commented Jan 11, 2018

It's great to hear that there are plans to create a first-class openArray. Then I think we can close this for now.

5 participants