Skip to content

Conversation

DilumAluthge
Copy link
Member

@DilumAluthge DilumAluthge commented Dec 29, 2020

Fixes #9

@codecov
Copy link

codecov bot commented Dec 29, 2020

Codecov Report

Merging #10 (88815d0) into master (72b511a) will increase coverage by 21.21%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master      #10       +/-   ##
===========================================
+ Coverage   77.87%   99.09%   +21.21%     
===========================================
  Files           7        9        +2     
  Lines         113      110        -3     
===========================================
+ Hits           88      109       +21     
+ Misses         25        1       -24     
Impacted Files Coverage Δ
src/Octavian.jl 100.00% <ø> (ø)
src/macrokernels.jl 90.90% <ø> (ø)
src/types.jl 100.00% <ø> (+55.55%) ⬆️
src/block_sizes.jl 100.00% <100.00%> (+6.25%) ⬆️
src/memory_buffer.jl 100.00% <100.00%> (ø)
src/pointer_matrix.jl 100.00% <100.00%> (ø)
src/utils.jl 100.00% <100.00%> (+28.57%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 72b511a...88815d0. Read the comment docs.

@DilumAluthge
Copy link
Member Author

@chriselrod In particular, can you take a look at my refactoring of src/block_sizes.jl , and make sure that everything is still correct there?

@DilumAluthge DilumAluthge marked this pull request as ready for review December 29, 2020 20:29
@DilumAluthge DilumAluthge force-pushed the dpa/add-tests branch 2 times, most recently from 261e754 to 854f6d9 Compare December 29, 2020 20:38
Copy link
Collaborator

@chriselrod chriselrod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactoring block_sizes relies on constant propagation for type stability.
I would definitely either add @inferred tests to make sure it is still type stable, or redefine _L3 and _L2 inside _calculate_L3.

end

function _calculate_L3(_L2, L2c, _L3, L3c, st)
if (2 * _L2 * L2c) > (_L3 * L3c)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use

if VectorizationBase.CACHE_INCLUSIVITY[3]
    L3 = (StaticInt{_L3}() - StaticInt{_L2}()) ÷ st
else
    L3 = StaticInt{_L3}() ÷ st
end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactoring block_sizes relies on constant propagation for type stability.

Oof. I'd rather not risk it. How about:

  1. For now, we revert the refactoring of block_sizes.
  2. In order to get code coverage in both paths of this if statement, we work on implementing an environment variable override feature in VectorizationBase.jl?

Can you undo the block_size changes and push to this branch? If not, I can do it when I'm back at a computer.

Copy link
Collaborator

@chriselrod chriselrod Dec 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that on my local master, where I made the switch to use VectorizationBase.CACHE_INCLUSIVITY[3] as the check:

julia> @code_typed Octavian.block_sizes(Float64)
CodeInfo(
1nothing::Nothingnothing::Nothing
└──     goto #3 if not false
2nothing::Nothing
3nothing::Nothing
└──     return (Static(96), Static(1173), Static(954))
) => Tuple{StaticInt{96}, StaticInt{1173}, StaticInt{954}}

julia> foo() = Octavian.block_sizes(Float64)
foo (generic function with 1 method)

julia> @code_typed foo()
CodeInfo(
1return (Static(96), Static(1173), Static(954))
) => Tuple{StaticInt{96}, StaticInt{1173}, StaticInt{954}}

blcok_sizes are entirely calculated at compile time / all the code gets compiled away.
So make sure we don't lose that in any refactoring!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something else you could do is call _calculate_L3 with StaticInts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think it's safer to leave everything in one function (like it is on master). That way, the compiler has the most ability to do optimizations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggested fix in the other comment should work perfectly well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tested locally:

julia> @code_typed Octavian.block_sizes(Float64)
CodeInfo(
1nothing::Nothingnothing::Nothingnothing::Nothing
└──     return (Static(96), Static(1173), Static(954))
) => Tuple{StaticInt{96}, StaticInt{1173}, StaticInt{954}}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly, I don't trust constant propagation to work reliably and not cause type inference problems.
This is the point of StaticInts: they're compile-time constants.

So as long as creating one can be inferred, i.e. because the number is constant at the location you're define the StaticInt, you can pass them around without issue and have things work/be compile time constants.

else
L3 = (StaticInt{_L3}() - StaticInt{_L2}()) ÷ st
end
L3 = _calculate_L3(_L2, L2c, _L3, L3c, st)
Copy link
Collaborator

@chriselrod chriselrod Dec 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use

L3 = _calculate_L3(StaticInt{_L2}(), StaticInt{_L3}(), st)

here, and then define

function _calculate_L3(_L2, _L3, st)
    if VectorizationBase.CACHE_INCLUSIVITY[3]
        (_L3 - _L2) ÷ st
    else
        _L3 ÷ st
    end
end

Copy link
Collaborator

@chriselrod chriselrod Dec 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This no longer relies on constant propagation, since we convert to StaticInts before calling _calculate_L3.

@DilumAluthge DilumAluthge marked this pull request as draft December 29, 2020 22:09
@DilumAluthge DilumAluthge marked this pull request as ready for review December 29, 2020 22:21
@DilumAluthge DilumAluthge marked this pull request as draft December 29, 2020 22:55
@DilumAluthge
Copy link
Member Author

@chriselrod Take a look now, I think everything is fixed.

This is what I get:

julia> @code_typed Octavian.block_sizes(Float64)
CodeInfo(
1nothing::Nothingnothing::Nothing
└──     goto #3 if not false
2nothing::Nothing
3return (Static(72), Static(284), Static(984))
) => Tuple{ArrayInterface.StaticInt{72}, ArrayInterface.StaticInt{284}, ArrayInterface.StaticInt{984}}

julia> @code_typed Octavian.block_sizes(Float32)
CodeInfo(
1nothing::Nothingnothing::Nothing
└──     goto #3 if not false
2nothing::Nothing
3return (Static(144), Static(284), Static(1974))
) => Tuple{ArrayInterface.StaticInt{144}, ArrayInterface.StaticInt{284}, ArrayInterface.StaticInt{1974}}

julia> @code_typed Octavian.block_sizes(Int64)
CodeInfo(
1nothing::Nothingnothing::Nothing
└──     goto #3 if not false
2nothing::Nothing
3return (Static(72), Static(284), Static(984))
) => Tuple{ArrayInterface.StaticInt{72}, ArrayInterface.StaticInt{284}, ArrayInterface.StaticInt{984}}

julia> @code_typed Octavian.block_sizes(Int32)
CodeInfo(
1nothing::Nothingnothing::Nothing
└──     goto #3 if not false
2nothing::Nothing
3return (Static(144), Static(284), Static(1974))
) => Tuple{ArrayInterface.StaticInt{144}, ArrayInterface.StaticInt{284}, ArrayInterface.StaticInt{1974}}

@DilumAluthge DilumAluthge marked this pull request as ready for review December 29, 2020 23:13
@DilumAluthge
Copy link
Member Author

@chriselrod I think this is good to go, so I'm merging. When you get a chance, can you double-check that the typed code looks correct?

@DilumAluthge DilumAluthge merged commit 789882f into master Dec 29, 2020
@DilumAluthge DilumAluthge deleted the dpa/add-tests branch December 29, 2020 23:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Combination of @time and @testset breaks getindex for PointerMatrix
2 participants