Refactor indexing #54
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -81,7 +81,13 @@ def [](*key) | |
|
||
slice first, last | ||
when key.size > 1 | ||
- Daru::Index.new key.map { |k| self[k] } | ||
+ if include? key[0] | ||
+ Daru::Index.new key.map { |k| k } | ||
+ else | ||
+ # Assume the user is specifying values for index not keys | ||
+ # Return index object having keys corresponding to values provided | ||
+ Daru::Index.new key.map { |k| key k } | ||
+ end | ||
else | ||
v = @relation_hash[loc] | ||
return loc if v.nil? | ||
|
@@ -143,6 +149,14 @@ def self._load data | |
|
||
Daru::Index.new(h[:relation_hash].keys) | ||
end | ||
|
||
+ # Provide an Index for sub vector produced | ||
+ # | ||
+ # @param input_indexes [Array] the input by user to index the vector | ||
+ # @return [Object] the Index object for sub vector produced | ||
+ def conform input_indexes | ||
+ self | ||
+ end | ||
end # class Index | ||
|
||
class MultiIndex < Index | ||
|
@@ -214,7 +228,7 @@ def [] *key | |
case | ||
when key[0].is_a?(Range) then retrieve_from_range(key[0]) | ||
when (key[0].is_a?(Integer) and key.size == 1) then try_retrieve_from_integer(key[0]) | ||
- else retrieve_from_tuples(key) | ||
+ else retrieve_from_tuples key | ||
end | ||
end | ||
|
||
|
@@ -236,7 +250,7 @@ def retrieve_from_tuples key | |
chosen = find_all_indexes label, level_index, chosen | ||
end | ||
|
||
- return chosen[0] if chosen.size == 1 | ||
+ return chosen[0] if chosen.size == 1 and key.size == @levels.size | ||
return multi_index_from_multiple_selections(chosen) | ||
end | ||
|
||
|
@@ -330,5 +344,14 @@ def values | |
def inspect | ||
"Daru::MultiIndex:#{self.object_id} (levels: #{levels}\nlabels: #{labels})" | ||
end | ||
|
||
+ # Provide a MultiIndex for sub vector produced | ||
+ # | ||
+ # @param input_indexes [Array] the input by user to index the vector | ||
+ # @return [Object] the MultiIndex object for sub vector produced | ||
+ def conform input_indexes | ||
+ return self if input_indexes[0].is_a? Range | ||
+ drop_left_level input_indexes.size | ||
+ end | ||
I can see that you've used this method to remove the leftmost level on this line and have kept it here for a uniform interface between indexes. But in the previous functionality, the number of levels removed was equal to the number of levels specified by the user. For example:
d = Daru::MultiIndex.from_tuples([
[:c,:one,:bar],
[:c,:one,:baz],
[:c,:two,:foo],
[:c,:two,:bar]
])
#=> Daru::MultiIndex:96634260 (levels: [[:c], [:one, :two], [:bar, :baz, :foo]]
#labels: [[0, 0, 0, 0], [0, 0, 1, 1], [0, 1, 2, 0]])
v = Daru::Vector.new([1,2,3,4], index: d)
#<Daru::Vector:101434370 @name = nil @size = 4 >
# nil
#[:c, :one, :bar] 1
#[:c, :one, :baz] 2
#[:c, :two, :foo] 3
#[:c, :two, :bar] 4
v[:c, :one]
#<Daru::Vector:101291480 @name = nil @size = 2 >
# nil
#[:bar] 1
#[:baz] 2
v[:c]
#=>
#<Daru::Vector:101176290 @name = nil @size = 4 >
# nil
#[:one, :bar] 1
#[:one, :baz] 2
#[:two, :foo] 3
#[:two, :bar] 4

I don't quite understand how the correct number of levels is being dropped when you never specify the number of levels the user has asked for.

Also

[17] pry(main)> d = Daru::MultiIndex.from_tuples([[:a, :one], [:a, :two]])
=> Daru::MultiIndex:20963520 (levels: [[:a], [:one, :two]]
labels: [[0, 0], [0, 1]])
[5] pry(main)> d.factor_out
=> Daru::MultiIndex:20450720 (levels: [[:one, :two]]
labels: [[0, 1]])
Here
[21] pry(main)> d = Daru::MultiIndex.from_tuples([[:a, :one], [:b, :two]])
=> Daru::MultiIndex:22630440 (levels: [[:a, :b], [:one, :two]]
labels: [[0, 1], [0, 1]])
[22] pry(main)> v = Daru::Vector.new([1,2], index: d)
=>
#<Daru::Vector:22281380 @name = nil @size = 2 >
nil
[:a, :one] 1
[:b, :two] 2
[23] pry(main)> v[:a]
=> 1
Notice, it never showed the
I checked out the previous functionality and it too didn't show
If I want to correct this inconsistency I think I would need to pass arguments to the

Yes, you will absolutely have to pass the number of levels to be dropped, because when a user specifies a given number of indexes for a MultiIndex vector/dataframe, only that many levels should be dropped. Your previous approach worked mainly because the specs were written that way. What if the levels were like this:
levels: [[:a], [:one], [:one, :two, :three, :four]]
And I have specified only

And about showing the

@v0dro No, they don't follow the same convention. For example:
>>> index = pd.MultiIndex.from_tuples([['a', 'one'], ['b', 'two']], names=['first', 'second'])
>>> s = pd.Series(np.random.randn(2), index=index)
>>> s
first second
a one 1.387915
b two -1.169805
dtype: float64
>>> s['a']
second
one 1.387915
dtype: float64

Ah. Port the pandas functionality then.

I just fixed it. Now it's showing

Yes absolutely.
||
end | ||
end |
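The rule agreed on in the review thread above (drop exactly as many leading levels as the user specified, and return a sub-vector rather than a single element for a partial tuple) can be sketched in plain Ruby without Daru. This is only an illustration under stated assumptions: `sub_index` is a hypothetical helper, not a Daru method, and it works on an array of tuples rather than on a real `Daru::MultiIndex`.

```ruby
# Hypothetical helper, not part of Daru: given the tuples of a multi-level
# index and a (possibly partial) key, return the labels of the selection.
def sub_index(tuples, *key)
  depth = tuples.first.size

  # Keep only the tuples whose leading labels match the key.
  matching = tuples.select { |tuple| tuple.first(key.size) == key }

  # A full tuple identifies single entries, so there is nothing to drop.
  return matching if key.size == depth

  # A partial tuple drops exactly as many leading levels as were specified.
  matching.map { |tuple| tuple.drop(key.size) }
end

tuples = [
  [:c, :one, :bar],
  [:c, :one, :baz],
  [:c, :two, :foo],
  [:c, :two, :bar]
]

p sub_index(tuples, :c, :one)  # => [[:bar], [:baz]]
p sub_index(tuples, :c)        # => [[:one, :bar], [:one, :baz], [:two, :foo], [:two, :bar]]

# A partial key that matches one row still yields a one-row sub-index.
p sub_index([[:a, :one], [:b, :two]], :a)  # => [[:one]]
```

Passing the key (and therefore its size) into the helper is the point of the sketch: the number of levels to drop comes from the caller's input, not from the shape of the index.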
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -455,7 +455,7 @@ | |
end | ||
|
||
it "returns a Vector if the last level of MultiIndex is tracked" do | ||
- expect(@df_mi[:a, :one]).to eq( | ||
+ expect(@df_mi[:a, :one, :bar]).to eq( | ||
Daru::Vector.new(@vector_arry1, index: @multi_index)) | ||
It wasn't tracking the last level.
||
end | ||
end | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -76,7 +76,7 @@ | |
context "#[]" do | ||
before do | ||
@id = Daru::Index.new [:one, :two, :three, :four, :five, :six, :seven] | ||
- @mixed_id = Daru::Index.new ['a','b','c',:d,:a,0,3,5] | ||
+ @mixed_id = Daru::Index.new ['a','b','c',:d,:a,8,3,5] | ||
Any particular reason why you replaced the 0 with 8?

The

Alright. But keep in mind that if

Could you give an example?

Example 1. Notice that since
[18] pry(main)> a = Daru::Vector.new([1,2,3,4], index: [2,3,0,1])
=>
#<Daru::Vector:88293040 @name = nil @size = 4 >
nil
2 1
3 2
0 3
1 4
[19] pry(main)> a[0]
=> 3
Example 2. Here 0 isn't present in the index, so it returns the 0th element:
[24] pry(main)> b = Daru::Vector.new([4,3,5,9], index: [:a, :b, :c, :d])
=>
#<Daru::Vector:87873110 @name = nil @size = 4 >
nil
a 4
b 3
c 5
d 9
[25] pry(main)> b[0]
=> 4

My code is good with these tests!

No problem then :)
||
end | ||
|
||
it "works with ranges" do | ||
|
@@ -88,12 +88,12 @@ | |
expect(@mixed_id[0..2]).to eq(Daru::Index.new(['a','b','c'])) | ||
|
||
# If at least one is a number then refer to actual indexing | ||
- expect(@mixed_id.slice('b',0)).to eq(Daru::Index.new(['b','c',:d,:a,0])) | ||
+ expect(@mixed_id.slice('b',8)).to eq(Daru::Index.new(['b','c',:d,:a,8])) | ||
end | ||
|
||
it "returns multiple keys if specified multiple indices" do | ||
- expect(@id[0,1,3,4]).to eq(Daru::Index.new([0,1,3,4])) | ||
- expect(@mixed_id[0,5,3,2]).to eq(Daru::Index.new([5, 7, 6, 2])) | ||
+ expect(@id[0,1,3,4]).to eq(Daru::Index.new([:one, :two, :four, :five])) | ||
+ expect(@mixed_id[0,5,3,2]).to eq(Daru::Index.new(['a', 8, :d, 'c'])) | ||
end | ||
|
||
it "returns correct index position for non-numeric index" do | ||
|
@@ -102,7 +102,7 @@ | |
end | ||
|
||
it "returns correct index position for mixed index" do | ||
- expect(@mixed_id[0]).to eq(5) | ||
+ expect(@mixed_id[8]).to eq(5) | ||
expect(@mixed_id['c']).to eq(2) | ||
end | ||
end | ||
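The label-versus-position behaviour from the two `pry` examples in the review thread above (an integer is treated as a label when it is present in the index, and as a position otherwise) can be sketched as follows. `fetch_value` is a hypothetical name used only for illustration; it operates on plain arrays and is not how Daru implements `Vector#[]`.

```ruby
# Hypothetical helper, not part of Daru: look up one element of `values`
# using `index` as the list of labels. Labels win over positions.
def fetch_value(values, index, key)
  # Position of the label in the index, or nil if the label is absent.
  position = index.index(key)

  # Fall back to positional access only for integers that are not labels.
  position = key if position.nil? && key.is_a?(Integer)

  raise IndexError, "no such index: #{key.inspect}" if position.nil?

  values.fetch(position)
end

# Example 1: 0 is a label (at position 2), so the labelled element is returned.
p fetch_value([1, 2, 3, 4], [2, 3, 0, 1], 0)      # => 3

# Example 2: 0 is not a label, so it is read as a position.
p fetch_value([4, 3, 5, 9], [:a, :b, :c, :d], 0)  # => 4
```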
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -177,11 +177,12 @@ | |
[:c,:one,:bar], | ||
[:c,:one,:baz], | ||
[:c,:two,:foo], | ||
- [:c,:two,:bar] | ||
+ [:c,:two,:bar], | ||
+ [:d,:one,:foo] | ||
] | ||
Had to add
||
@multi_index = Daru::MultiIndex.from_tuples(@tuples) | ||
@vector = Daru::Vector.new( | ||
- Array.new(12) { |i| i }, index: @multi_index, | ||
+ Array.new(13) { |i| i }, index: @multi_index, | ||
dtype: dtype, name: :mi_vector) | ||
end | ||
|
||
|
@@ -211,6 +212,12 @@ | |
dtype: dtype, name: :sub_sub_vector)) | ||
end | ||
|
||
+ it "returns sub vector not a single element when passed the partial tuple" do | ||
+ mi = Daru::MultiIndex.from_tuples([[:foo]]) | ||
+ expect(@vector[:d, :one]).to eq(Daru::Vector.new([12], index: mi, | ||
+ dtype: dtype, name: :sub_sub_vector)) | ||
+ end | ||
|
||
it "returns a vector with corresponding MultiIndex when specified numeric Range" do | ||
mi = Daru::MultiIndex.from_tuples([ | ||
[:a,:two,:baz], | ||
|
@v0dro I am working on improving the indexing infrastructure. I have made a major change in Index behavior. I think an example will illustrate it best:
Earlier it used to be:
The motive behind this is to remove the frequent type checks on the index and to move the logic that guesses whether the user is giving index values (not keys) from the Vector class to the Index class, because it seems more inherent to the index. Are you good with this?
If so, I plan to do the same with MultiIndex and finally remove the conditionals.
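As a rough illustration of the multi-key behaviour this PR moves into the index, the sketch below mirrors the branch added to `Index#[]` and the updated `@mixed_id` spec: if the first key is a known label, the keys are taken as labels; otherwise they are assumed to be positions and mapped to the labels at those positions. `resolve_keys` is a hypothetical name and works on a plain array of labels, so it is only an approximation of what `Daru::Index` does.

```ruby
# Hypothetical helper, not part of Daru: resolve several keys against a flat
# list of index labels and return the labels of the resulting sub-index.
def resolve_keys(labels, *keys)
  if labels.include?(keys.first)
    # The first key is a known label, so treat every key as a label.
    keys
  else
    # Otherwise assume the caller passed positions, not labels.
    keys.map { |position| labels.fetch(position) }
  end
end

mixed = ['a', 'b', 'c', :d, :a, 8, 3, 5]

# 0 is not a label, so all four keys are read as positions.
p resolve_keys(mixed, 0, 5, 3, 2)  # => ["a", 8, :d, "c"]

# 'b' is a label, so the keys are returned as given.
p resolve_keys(mixed, 'b', :d)     # => ["b", :d]
```

Note that the decision is made from the first key only, as in the diff: in the first call 5 and 3 are also labels of `mixed`, yet they are still read as positions because 0 is not.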