Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dynamically resize the table matrix on set_value #233

Merged
merged 14 commits into from Nov 6, 2020
Merged

dynamically resize the table matrix on set_value #233

merged 14 commits into from Nov 6, 2020

Conversation

hiaselhans
Copy link
Contributor

@hiaselhans hiaselhans commented Nov 3, 2020

I will conform to pr checklist later.

here is a use case:

sheet=pyexcel.Sheet()
sheet[1,1] = "JO"
sheet["AA1"] = "test"

that leads to an index error.
I think it is a valid use case to set arbitrary cells at random positions?

This pr adds arguments to uniform so we can make an array of min_width and min_height


---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-2-df6d362a346f> in <module>
      1 sheet=pyexcel.Sheet()
      2 sheet[1,1] = "JO"
----> 3 sheet["AA1"] = "test"

~/dev/pyexcel/pyexcel/sheet.py in __setitem__(self, aset, c)
    607             self.cell_value(row, column, c)
    608         else:
--> 609             Matrix.__setitem__(self, aset, c)
    610 
    611     def __len__(self):

~/dev/pyexcel/pyexcel/internal/sheets/matrix.py in __setitem__(self, aset, cell_value)
    467         elif isinstance(aset, str):
    468             row, column = utils.excel_cell_position(aset)
--> 469             return self.cell_value(row, column, cell_value)
    470         else:
    471             raise IndexError

~/dev/pyexcel/pyexcel/internal/sheets/matrix.py in cell_value(self, row, column, new_value)
     94                 raise IndexError("Index out of range")
     95             else:
---> 96                 self.paste((row, column), [[new_value]])
     97 
     98     def row_at(self, index):

~/dev/pyexcel/pyexcel/internal/sheets/matrix.py in paste(self, topleft_corner, rows, columns)
    405         """
    406         if rows:
--> 407             self._paste_rows(topleft_corner, rows)
    408         elif columns:
    409             self._paste_columns(topleft_corner, columns)

~/dev/pyexcel/pyexcel/internal/sheets/matrix.py in _paste_rows(self, topleft_corner, rows)
    427             set_index = starting_row + index
    428             if set_index < number_of_rows:
--> 429                 self._set_row_at(set_index, row, starting=topleft_corner[1])
    430             else:
    431                 real_row = [constants.DEFAULT_NA] * topleft_corner[1] + row

~/dev/pyexcel/pyexcel/internal/sheets/matrix.py in _set_row_at(self, row_index, data_array, starting)
    153             self.__width, self.__array = uniform(self.__array)
    154         else:
--> 155             raise IndexError(constants.MESSAGE_INDEX_OUT_OF_RANGE)
    156 
    157     def _extend_row(self, row):

With your PR, here is a check list:

  • Has test cases written?
  • Has all code lines tested?
  • Has make format been run?
  • Please update CHANGELOG.yml(not CHANGELOG.rst)
  • Passes all Travis CI builds
  • Has fair amount of documentation if your change is complex
  • Agree on NEW BSD License for your contribution

@chfw
Copy link
Member

chfw commented Nov 3, 2020

Yep, pyexcel has not explored random access, which restrict use cases like yours.

I recommend you have a closer look at paste function, which I explicitly forbid it exceeding the original dimension. You may find a way to relax it.

@hiaselhans
Copy link
Contributor Author

i see that, the restriction is in _set_row_at.

so, given that there is an expensive call to uniform() in that function too, is there something wrong with my approach or is it just unintended behavior to allow random inserts?

@chfw
Copy link
Member

chfw commented Nov 3, 2020

I could be biased. It seems uniform function will also grow the two dimensional array, plus the original intent. Could the growth function be separated?

As to other side effect, I don’t think so. Current constraints were kept because personally I do not have a strong use case to support it. I wrote a lot of useless functions in pyexcel in the beginning.

So, as long as your change would pass the existing unit tests, your PR with unit tests for new code are good enough to be released.

@chfw
Copy link
Member

chfw commented Nov 3, 2020

‘make format’ will do automatically code formatting.

@codecov-io
Copy link

codecov-io commented Nov 3, 2020

Codecov Report

Merging #233 into dev will decrease coverage by 0.02%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##              dev     #233      +/-   ##
==========================================
- Coverage   98.35%   98.32%   -0.03%     
==========================================
  Files         103      104       +1     
  Lines        6732     6763      +31     
==========================================
+ Hits         6621     6650      +29     
- Misses        111      113       +2     
Impacted Files Coverage Δ
pyexcel/internal/sheets/matrix.py 97.61% <100.00%> (-0.87%) ⬇️
tests/base.py 95.63% <100.00%> (+0.13%) ⬆️
tests/test_sheet_access.py 100.00% <100.00%> (ø)
pyexcel/sheet.py 96.01% <0.00%> (+0.36%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 82be52b...e1c24a7. Read the comment docs.

@chfw
Copy link
Member

chfw commented Nov 3, 2020

Good, it means there is no regression caused.

Please write the unit tests which will protect your use case in the future:

  1. test(s) that prove your use case works correct
  2. exploratory tests if any

In the long run, the unit tests you contributed will live longer than the code you wrote.

In essence, your use case will be kept alive as long as your unit tests will be kept. The non-test codes you contributed are mortal.

@hiaselhans
Copy link
Contributor Author

thx @chfw !

I will prepare this pr right and add tests.
To my understanding we don't have any additional overhead introduced by this.

@chfw
Copy link
Member

chfw commented Nov 4, 2020

As a side topic, TDD would advise you to write unit test first then write the code.

@hiaselhans
Copy link
Contributor Author

sure!

I am happy to comply with your rules (even 80char limits) and contribute more, if you are fine with human beings like me having different ways of solving their problems :)

Shall i add a howto contribute and missing commands from pr template to ease the path for future contributors?

@chfw
Copy link
Member

chfw commented Nov 4, 2020

You are welcome to contribute more.

Regarding contributor guide and PR template, you can find them here: https://github.com/pyexcel/pyexcel-mobans/tree/master/templates. Once your changes is committed there, moban command will semi-automatically reflect your changes in pyexcel and pyexcel's other repos

@hiaselhans
Copy link
Contributor Author

thanks @chfw .

i meant to make a guide on tooling / requirements for pre-commit checks...

however thats a different story and i think all issues with this pr are resolved.
codecov should be increased after all...

@chfw
Copy link
Member

chfw commented Nov 4, 2020

Precommit sounds cool. I never had the time to work it out.

from base import SheetBaseTestcase


class SheetAccessTest(SheetBaseTestcase):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems you come from OOP background!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every time I make a class, I will ask myself: will I use any of the OOP features? If answer is NO, I will not use use classes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not take it as criticism. Developers have their own view. I would like to hear more from you. That is an exchange.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, i'm not coming from the java world if that's what you mean... ;)

If you have a hard opinion about it i can revert it to pure functions, but especially in unit tests i liked the idea of having a common base class. For grouping as well as for the common setUp and tearDown functions. I saw you were using them quite frequently too...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I am opinionated when it comes to Java. “Thinking in Java” book had a a whole chapter: everything is object, hence Java developers cannot forget it when they write Python code.

Surely if you use OOP features, then a class will make more sense. What I am object to make a class for the sake of OOP.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure.

i added another test to raise the coverage, and make the point of having a class more clear ;)
Don't know why codecov still fails. any idea?

@hiaselhans
Copy link
Contributor Author

well. shame on me. the regex seems to be (Test*)

fixed that. now theres a ton of commits but you can squash them if you want..

self.paste((row, column), [[new_value]])
else:
if not fit:
width, array = uniform(self.__array, row+1, column+1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here, it is supposed that row > number of rows and column > number of rows.

what happens if:

  1. row < number of rows but column > number of rows
  2. row > number of rows but column < number of rows

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is good to add two tests to test above two conditions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i added those two edge cases as test, but to me it looks pretty clear:

fit = row < row_num and column < column_num
row < row_num, column > column_num => True and False => fit = false
row > row_num, column < coulmn_num => False and True => fit = false

@chfw
Copy link
Member

chfw commented Nov 5, 2020

the PR looks good now. Let's see about the 'extra miles' if you could extend the unit tests a bit.

@chfw chfw merged commit 9b54557 into pyexcel:dev Nov 6, 2020
2 of 3 checks passed
@chfw
Copy link
Member

chfw commented Nov 6, 2020

da3d682

This was referenced Nov 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants