Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support overlapping tiles #23

Merged
merged 21 commits into from
Oct 22, 2020
Merged

Support overlapping tiles #23

merged 21 commits into from
Oct 22, 2020

Conversation

jw3126
Copy link
Contributor

@jw3126 jw3126 commented Oct 13, 2020

Close #22
This needs at least docs and more tests. But before I go into that, @johnnychen94 what do you think about the API/design of this PR?

@codecov
Copy link

codecov bot commented Oct 13, 2020

Codecov Report

Merging #23 into master will decrease coverage by 2.89%.
The diff coverage is 90.47%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #23      +/-   ##
==========================================
- Coverage   96.00%   93.10%   -2.90%     
==========================================
  Files           1        2       +1     
  Lines          75      116      +41     
==========================================
+ Hits           72      108      +36     
- Misses          3        8       +5     
Impacted Files Coverage Δ
src/TiledIteration.jl 96.22% <ø> (+0.22%) ⬆️
src/tileiterator.jl 90.47% <90.47%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8dcfb86...af9bd17. Read the comment docs.

@johnnychen94
Copy link
Member

johnnychen94 commented Oct 14, 2020

As a prototype, there's no doubt that this version works for your purpose. I have thought about the enhancement of this package but haven't dig into all the possibilities and implementation details. Here are some of my thoughts; they are not what we must do, but I'd like to see things head approximately in that direction.


Codes in the current PR version seems a bit coupled together and might be very hard to extend. If it is possible, I feel it would be better and cleaner to implement a new Range type for the Balanced version, and reuse StepRange for the Fixed version. And keep the previous struct type as possible as we can.

struct TileIterator{N,C} <: AbstractArray{NTuple{N, UnitRange{Int}}, N}
    covers1d::C
end

IIUC covers1d here is a tuple of vectors, which isn't great in performance. We should make this vector lazily generated. I think that's why the previous struct type is designed that way.


Generally, I don't think it's a good direction to make it an array type. For simple cases like this, we could have efficient getindex implementation, but that's not always possible for other cases. Make it an iterator would be more efficient and extensible, but yes, it requires some effort to implement it.


I'm not sure if we should add stride keyword here, I feel it'd be better to

TileIterator(axes::Indices, ::IterationStrategy)

size and stride are just meta info to iteration strategy. I prefer to let TileIterator to normalize the axes input/output and let IterationStrategy to handle the dirty and complex iteration implementation.

Does every strategy has a stride meta? I don't know, I don't feel it good to assume that. We assume iteration along the column direction here, it might be good to assume that we also want to support iteration along row direction in the future. How would we solve that, adding a new columnmajor=true keyword? Probably, but that would exponentially increase the complexity.

In general, I want to get to a design where iteration strategies are pluggable and composable.

@jw3126
Copy link
Contributor Author

jw3126 commented Oct 14, 2020

Thanks a lot for your comments. I agree with most of what you said, but have a different view on some points:

Codes in the current PR version seems a bit coupled together and might be very hard to extend. If it is possible, I feel it would be better and cleaner to implement a new Range type for the Balanced version, and reuse StepRange for the Fixed version. And keep the previous struct type as possible as we can.

struct TileIterator{N,C} <: AbstractArray{NTuple{N, UnitRange{Int}}, N}
    covers1d::C
end

IIUC covers1d here is a tuple of vectors, which isn't great in performance. We should make this vector lazily generated. I think that's why the previous struct type is designed that way.

There seems to be a misunderstanding covers1d is a tuple of AbstactVector{UnitRange{Int}}. It can be a tuple of Vector, but that is not the default. The default is lazy. Either StepRange or QuantizedRange in the balanced case. I think this is very flexible and easy to extend. I did not look into performance yet.
I could imagine, that a second field is needed for faster iteration, but I would like to avoid that if possible.

struct TileIterator{...}
covers1d::C
_fast_iteration_implementation_detail::F
end

Generally, I don't think it's a good direction to make it an array type. For simple cases like this, we could have efficient getindex implementation, but that's not always possible for other cases. Make it an iterator would be more efficient and extensible, but yes, it requires some effort to implement it.

I think in the above situation it is very natural to make an AbstractArray and getindex can be fast. And I think it will be easy for any tiling that is a product of 1d cases. If we have a case that is not a product, it should use another type. Something like

struct ExoticTileIterator <: NotAnAbstractArray
...
end

Maybe we should use tileiterator(axes, strategy) and it will create a ProductTiling <: AbstractArray or MyExoticTiling <: NotAnArray.

I'm not sure if we should add stride keyword here, I feel it'd be better to

TileIterator(axes::Indices, ::IterationStrategy)

size and stride are just meta info to iteration strategy. I prefer to let TileIterator to normalize the axes input/output and let IterationStrategy to handle the dirty and complex iteration implementation.

Excellent point I agree.

Does every strategy has a stride meta? I don't know, I don't feel it good to assume that. We assume iteration along the column direction here, it might be good to assume that we also want to support iteration along row direction in the future. How would we solve that, adding a new columnmajor=true keyword? Probably, but that would exponentially increase the complexity.

I fully agree here, is better to go with your ::IterationStrategy suggestion.

In general, I want to get to a design where iteration strategies are pluggable and composable.

Again I fully agree. I am however not sure, what pluggable and composable mean here concretely.

@johnnychen94
Copy link
Member

I'm a little bit with my own work here and I'm sorry that I can't give any detailed feedback on the code at this moment. It looks like you clearly know what should be done. My major concern about this PR and the changes is the performance overhead and extensibility. If that's done and I'm good with that.


IIUC covers1d here is a tuple of vectors, which isn't great in performance. We should make this vector lazily generated. I think that's why the previous struct type is designed that way.

There seems to be a misunderstanding covers1d is a tuple of AbstactVector{UnitRange{Int}}. It can be a tuple of Vector, but that is not the default. The default is lazy. Either StepRange or QuantizedRange in the balanced case. I think this is very flexible and easy to extend. I did not look into performance yet.

Hmmm, I don't quite get it here because the constructor tells me this.

function TileIterator(covers1d::NTuple{N, AbstractVector{UnitRange{Int}}}) where {N}
    C = typeof(covers1d)
    return TileIterator{N, C}(covers1d)
end

This representation might bring some overhead and I'd like to avoid that overhead. I probably am wrong here but still, a performance benchmark is needed to support this change.

Maybe we should use tileiterator(axes, strategy) and it will create a ProductTiling <: AbstractArray or MyExoticTiling <: NotAnArray.

Yeah, perhaps. It seems like this doesn't need to be included in this PR.

In general, I want to get to a design where iteration strategies are pluggable and composable.
Again I fully agree. I am however not sure, what pluggable and composable mean here concretely.

By composable I mean two iteration strategy can be composed together into another strategy. For example, Fixed() ∘ RowMajor() or something like this. Again, this doesn't need to be included in this PR but it would be better that we consider this possibility and extensibility.

@jw3126
Copy link
Contributor Author

jw3126 commented Oct 15, 2020

Hmmm, I don't quite get it here because the constructor tells me this.

function TileIterator(covers1d::NTuple{N, AbstractVector{UnitRange{Int}}}) where {N}
    C = typeof(covers1d)
    return TileIterator{N, C}(covers1d)
end

This representation might bring some overhead and I'd like to avoid that overhead. I probably am wrong here but still, a performance benchmark is needed to support this change.

This constructor is a bit low level. If you ask by whatever API for the balanced case, you would get

typeof(covers1d) = Tuple{CoveredRange{QuantizedRange....

which is a bitstype and completly lazy. Did you worry because you thought e.g. this would allocate a Vector or are you worrying that the compiler does not properly inline away all the abstraction?

@johnnychen94
Copy link
Member

johnnychen94 commented Oct 15, 2020

Ha, I get your point here; didn't realize that CoveredRange <: AbstractVector. I was worried about the additional overhead during getindex(::Vector, i) but since this is lazily generated, it looks good.

I was worried about the overhead during iteration, for example:

julia> titr = TileIterator((1:7, 3:6), tilesize=(2, 2), stride=Balanced((3, 2)));

julia> @btime getindex(titr, 2, 2)
  13.920 ns (0 allocations: 0 bytes)
(3:4, 5:6)

julia> titr = TileIterator((1:7, 3:6), tilesize=(2, 2), stride=Fixed((3, 2)));

julia> @btime getindex(titr, 2, 2)
  0.045 ns (0 allocations: 0 bytes)
(4:5, 5:6)

great that this has zero allocation. So yeah, this implementation is quite efficient already. Perhaps there's still some room to get getindex for Balanced faster; I'm a bit curious why it's slower here.

The time spent on construction is less important here since we generally don't need to repeatedly create TileIterators; it's still worth improving if that's not too hard.


This looks good to me now, a really nice play with the types and dispatches. I guess we just need to leave some room for future extensibility as we discussed. Then add docs and tests and I feel it could be good to merge.


# strategies
export RelaxStride
struct RelaxStride{N}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought, lets be conservative for now. I just added these two strategies. They are enough to cover the funcionality on master as well as my balanced tile size use case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A different PR could implement a strategy that sets the stride to (1,1,1...) giving the behavior sketched here:
JuliaImages/ImageFiltering.jl#155 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For more advanced strides, we might want to do
tileiterator(axes, UnitStride())[begin:2:end, :, begin+2:4:end-2]

Copy link
Member

@johnnychen94 johnnychen94 Oct 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works but might not be the most efficient way because indexing with [begin:2:end, :, begin+2:4:end-2] allocates a new array.

No need to implement in this PR, though. It is also very likely that reducing memory allocation only gives a marginal performance boost.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are right this allocates currently, we can fix it, when it becomes a practical obstacle.

@johnnychen94
Copy link
Member

johnnychen94 commented Oct 20, 2020

Probably we could just switch to use Travis for the Windows platform and sunset appveyor so that we don't get troubled with 32bit doctest.

@jw3126
Copy link
Contributor Author

jw3126 commented Oct 20, 2020

I did remove appveyor.yml and travis passes on windows. Probably need to deactivate appveyor on the appveyor website or something?

Copy link
Member

@johnnychen94 johnnychen94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current version looks good to me; probably the last round review. More docs and tests might be helpful, though.

src/tileiterator.jl Show resolved Hide resolved
test/runtests.jl Outdated Show resolved Hide resolved
test/runtests.jl Outdated Show resolved Hide resolved
src/tileiterator.jl Show resolved Hide resolved
@johnnychen94
Copy link
Member

I did remove appveyor.yml and travis passes on windows. Probably need to deactivate appveyor on the appveyor website or something?

Will remove that once this PR is merged.

Copy link
Member

@johnnychen94 johnnychen94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I think it's good to merge once the test passes. This is not a breaking change, but I'll bump to v0.3.0 since it drops Julia v0.7 support.

ping @timholy in case he has some thoughts about this PR.

@jw3126
Copy link
Contributor Author

jw3126 commented Oct 20, 2020

Thanks @johnnychen94 for the detailed review and feedback.

@timholy
Copy link
Member

timholy commented Oct 20, 2020

I've got this on my calendar to review tomorrow, sorry for the delay.

Copy link
Member

@timholy timholy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really great! In the future we may also want to move some of the tile-padding logic from ImageFiltering here, and this will be a good foundation.

src/TiledIteration.jl Outdated Show resolved Hide resolved
src/tileiterator.jl Show resolved Hide resolved
src/tileiterator.jl Show resolved Hide resolved
src/tileiterator.jl Outdated Show resolved Hide resolved
src/tileiterator.jl Outdated Show resolved Hide resolved
src/tileiterator.jl Outdated Show resolved Hide resolved
src/tileiterator.jl Outdated Show resolved Hide resolved
src/tileiterator.jl Outdated Show resolved Hide resolved
test/runtests.jl Outdated Show resolved Hide resolved
Co-authored-by: Tim Holy <tim.holy@gmail.com>
jw3126 and others added 6 commits October 21, 2020 13:20
Co-authored-by: Tim Holy <tim.holy@gmail.com>
Co-authored-by: Tim Holy <tim.holy@gmail.com>
Co-authored-by: Tim Holy <tim.holy@gmail.com>
Co-authored-by: Tim Holy <tim.holy@gmail.com>
@timholy
Copy link
Member

timholy commented Oct 22, 2020

There may be something weird going on. E.g., 75b2a90 doesn't match the commit-summary.

@jw3126
Copy link
Contributor Author

jw3126 commented Oct 22, 2020

Yeah I messed something up. I think we can just squash this PR or should I try to clean the history?

@timholy
Copy link
Member

timholy commented Oct 22, 2020

I'll squash, no worries about that.

@timholy timholy merged commit 5808ecf into JuliaArrays:master Oct 22, 2020
@timholy
Copy link
Member

timholy commented Oct 22, 2020

Thanks again, terrific to have this!

@timholy
Copy link
Member

timholy commented Oct 22, 2020

There's nothing breaking about this, right?

Seems a little weird to release such a major improvement as 0.2.6, but OTOH not having to do version bumps in dependencies has advantages.

@timholy
Copy link
Member

timholy commented Oct 22, 2020

Packages that would need [compat] bumps are just ImageFiltering, ImageMorphology, and Images (of those in General, anyway). Not bad at all. If we do want to increase the version number, perhaps we should contemplate 1.0?

@jw3126
Copy link
Contributor Author

jw3126 commented Oct 22, 2020

Yeah, I think this is not breaking. I would go with 0.2.6. For 1.0 I think we might want to gather some more experience with the TileIterator(axes, strategy) API.

@johnnychen94
Copy link
Member

The only reason to bump to v0.3.0 is that Julia 0.7 support is dropped here.

@timholy
Copy link
Member

timholy commented Oct 22, 2020

Oh right, we'll have to bump in fact. OK, let's release this as 0.3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"Uniform tiling" in scope
3 participants