New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Even if the format is a string it is displayed as a float #29

Closed
Navis-Raven opened this Issue Apr 13, 2016 · 34 comments

Comments

Projects
None yet
2 participants
@Navis-Raven

Navis-Raven commented Apr 13, 2016

This a bug of display and very annoying, since even if a number (like serial number for example) is wirtten as a text into a table it is still displayed as a float,

Example:

'>>> ReportWorkbook = pe.get_book(file_name=rep_name)
'>>> ReportWorkbook['Infos']
Sheet Name: Infos
+-----------------+------------------+
| Date | 2016-03-31 10:59 |
+-----------------+------------------+
| Model name | XS360_EU |
+-----------------+------------------+
| Time by X | 60 |
+-----------------+------------------+
| Product A | 123 |
+-----------------+------------------+
| Product B | 4.567e+15 |
+-----------------+------------------+
| Bandwidth (UDP) | N/A |
+-----------------+------------------+
| Other infos | N/A |
+-----------------+------------------+
'>>> ReportWorkbook['Infos'].column[1]
['2016-03-31 10:59', 'XS360_EU', 60.0, '0123', '04566651561653122', 'N/A', 'N/A']

As seen above the serial numbers of products A and B are 0123 and 04566651561653122 are stocked as a text (string), however. Whend displayed with print they are displayed as a float so we se the first '0' disappear and 4.567e+15 insetead of 04566651561653122

@chfw

This comment has been minimized.

Member

chfw commented Apr 13, 2016

which file format is it? and is the file written by pyexcel?

@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 13, 2016

Yes the file is written with pyexcel

The file format is ods

@chfw

This comment has been minimized.

Member

chfw commented Apr 13, 2016

are you using pyexcel-ods3 or pyexcel-ods?

@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 13, 2016

pyexcel-ods3

@chfw

This comment has been minimized.

Member

chfw commented Apr 13, 2016

>>> import pyexcel,pyexcel.ext.ods3
>>> io=pyexcel.save_as(array=a, dest_file_type="ods")
>>> s=pyexcel.get_sheet(file_type='ods', file_stream=io)
>>> s
Sheet Name: pyexcel_sheet1
+-----+-------+
| 123 | 11234 |
+-----+-------+
>>> s.to_array() # can you do this and see if it is correct as you expect?
[[u'0123', u'011234']]
@chfw

This comment has been minimized.

Member

chfw commented Apr 13, 2016

In addition, you could do pip install pyexcel-text, which replaces text-table as presentation library

>>> import pyexcel.ext.text
>>> s
Sheet Name: pyexcel_sheet1
----  ------
0123  011234
----  ------
@chfw

This comment has been minimized.

Member

chfw commented Apr 13, 2016

Oh.. sorry, I have noticed that you have it displayed correctly but just the presentation got it wrong. I started to understand your problem now.

@chfw chfw added the bug label Apr 13, 2016

@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 13, 2016

Initially we have this

>>> ReportWorkbook['Infos'].column[1]
['2016-03-31 10:59', 'XS360_EU', 60.0, '0123', '04566651561653122', 'N/A', 'N/A']

with '0123' and '04566651561653122' as a string !

But when we display it we get

+-----------------+------------------+
| Product A | 123 |
+-----------------+------------------+

No more 0

And

+-----------------+------------------+
| Product B | 4.567e+15 |
+-----------------+------------------+

We have 4.567e+15 instead of 04566651561653122, the print considers those strings as float

@chfw

This comment has been minimized.

Member

chfw commented Apr 13, 2016

reproduced.

>>> a=[['2016-03-31 10:59', 'XS360_EU', 60.0, '0123', '04566651561653122', 'N/A', 'N/A']]
>>> import pyexcel
>>> s=pyexcel.get_sheet(array=a)
>>> s
Sheet Name: pyexcel_sheet1
+------------------+----------+----+-----+-----------+-----+-----+
| 2016-03-31 10:59 | XS360_EU | 60 | 123 | 4.567e+15 | N/A | N/A |
+------------------+----------+----+-----+-----------+-----+-----+
@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 15, 2016

Hi, When do you think this bug will be solved ?

In the next release ?

@chfw

This comment has been minimized.

Member

chfw commented Apr 15, 2016

Will be in in v0.2.1 branch. I have the initial fix and will push it up tonight.

chfw added a commit that referenced this issue Apr 15, 2016

chfw added a commit that referenced this issue Apr 15, 2016

@chfw chfw closed this in f71643f Apr 15, 2016

@chfw

This comment has been minimized.

Member

chfw commented Apr 15, 2016

Could you please evaluate the fix?

pip3 install https://github.com/pyexcel/pyexcel/archive/v0.2.1.zip

@chfw chfw reopened this Apr 15, 2016

chfw added a commit that referenced this issue Apr 16, 2016

@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 18, 2016

Hi,

this fix gave me this:


>>> import pyexcel
>>> import pyexcel as pe
>>> import pyexcel.ext.ods3
>>> from pyexcel_ods3 import get_data, save_data
>>> ReportWorkbook = pe.get_book(file_name=name)
>>> ReportWorkbook['Infos']
Sheet Name: Infos
---------------  -----------------
Date             2016-03-31 10:59
Model name       NS1800_EU
Time by att      60.0
Product A        0123
Product B        04566651561653122
Bandwidth (UDP)  N/A
Other infos      N/A
---------------  -----------------

instead of something like this :

>>> ReportWorkbook = pe.get_book(file_name=rep_name)
>>> ReportWorkbook['Infos']
Sheet Name: Infos
+-----------------+------------------+
| Date            | 2016-03-31 10:59 |
+-----------------+------------------+
| Model name      | XS360_EU         |
+-----------------+------------------+
| Time by X       | 60               |
+-----------------+------------------+
| Product A       | 0123             |
+-----------------+------------------+
| Product B       | 04566651561653122|
+-----------------+------------------+
| Bandwidth (UDP) | N/A              |
+-----------------+------------------+
| Other infos     | N/A              |
+-----------------+------------------+

Off course we don't have anymore the problem of string or float, but where is the table ? We don't have the table anymore. the table is printed like a text with tabulation, this is not what we've expected.

@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 18, 2016

To summuarize

This is what originally I have with pyexcel 0.2.0

>>> import pyexcel as pe
>>> import pyexcel.ext.ods3
>>> from pyexcel_ods3 import get_data, save_data
>>> ReportWorkbook = pe.get_book(file_name=name)
>>> ReportWorkbook['Infos']
Sheet Name: Infos
+-----------------+------------------+
| Date            | 2016-03-31 10:59 |
+-----------------+------------------+
| Model name      | NS1800_EU        |
+-----------------+------------------+
| Time by att     | 60               |
+-----------------+------------------+
| Product A       | 123              |
+-----------------+------------------+
| Product B       | 4.567e+15        |
+-----------------+------------------+
| Bandwidth (UDP) | N/A              |
+-----------------+------------------+
| Other infos     | N/A              |
+-----------------+------------------+
>>> 

And this is what is expected

>>> import pyexcel as pe
>>> import pyexcel.ext.ods3
>>> from pyexcel_ods3 import get_data, save_data
>>> ReportWorkbook = pe.get_book(file_name=name)
>>> ReportWorkbook['Infos']
Sheet Name: Infos
+-----------------+-------------------+
| Date            | 2016-03-31 10:59  |
+-----------------+-------------------+
| Model name      | NS1800_EU         |
+-----------------+-------------------+
| Time by att     | 60                |
+-----------------+-------------------+
| Product A       | 0123              |
+-----------------+-------------------+
| Product B       | 04566651561653122 |
+-----------------+-------------------+
| Bandwidth (UDP) | N/A               |
+-----------------+-------------------+
| Other infos     | N/A               |
+-----------------+-------------------+
>>> 

Because we have this:

>>> ReportWorkbook['Infos'].column[1]
['2016-03-31 10:59', 'XS360_EU', 60.0, '0123', '04566651561653122', 'N/A', 'N/A']

'0123' and '04566651561653122' are string and not float or int

however this is what pyexcel v0.2.1 (not released) give

>>> import pyexcel
>>> import pyexcel as pe
>>> import pyexcel.ext.ods3
>>> from pyexcel_ods3 import get_data, save_data
>>> ReportWorkbook = pe.get_book(file_name=name)
>>> ReportWorkbook['Infos']
Sheet Name: Infos
---------------  -----------------
Date             2016-03-31 10:59
Model name       NS1800_EU
Time by att      60.0
Product A        0123
Product B        04566651561653122
Bandwidth (UDP)  N/A
Other infos      N/A
---------------  -----------------

There is no more problem of float or string with '0123' and '04566651561653122' but there is no table

@chfw

This comment has been minimized.

Member

chfw commented Apr 18, 2016

In short, please do pip uninstall pyexcel-text. This is new thing in v0.2.1.

Long answer, v0.2.1 does auto import on pyexcel-text. The way you can manage it is to use pip to add/remove pyexcel-plugins(pyexcel/pyexcel-io#7). And the same thing will happen to pyexcel-ods3 in its version 0.2.0 release. The benefit is that no longer, a developer needs to do explicit import.

@chfw

This comment has been minimized.

Member

chfw commented Apr 18, 2016

And I am sorry that I should have informed you about the new feature in v0.2.1 3 days ago.

@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 18, 2016

After doing pip3 uninstall pyexcel-text

it gave me this:

>>> ReportWorkbook = pe.get_book(file_name=name)
>>> ReportWorkbook['Infos']
Sheet Name: Infos
+-----------------+-------------------+
| Date            | 2016-03-31 10:59  |
+-----------------+-------------------+
| Model name      | NS1800_EU         |
+-----------------+-------------------+
| Time by att     | 60.0              |
+-----------------+-------------------+
| Product A       | 0123              |
+-----------------+-------------------+
| Product B       | 04566651561653122 |
+-----------------+-------------------+
| Bandwidth (UDP) | N/A               |
+-----------------+-------------------+
| Other infos     | N/A               |
+-----------------+-------------------+

So I think that the bug is solved in this version.

However, I have a question, what if a user wants to use both of pyexcel-text and pyexcel-ods3 ?

it is more and more difficult to know which package to install, then uninstall

@chfw

This comment has been minimized.

Member

chfw commented Apr 18, 2016

pyexcel-text modifies the default string presentation of pyexcel sheet, which is an additional functionality that other pyexcel plugins do not do. It does become a problem when the user of pyexcel does not want presentation modification but still needs pyexcel-text for text file writing. Let me think about the solution.

@chfw

This comment has been minimized.

Member

chfw commented Apr 18, 2016

Will provide a solution for #31 before I release v0.2.1

@chfw chfw closed this Apr 18, 2016

@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 19, 2016

I remarked something strange in this version 0.2.1.

When we put a number as an int, if we print the table, it displays the number as an int.

But, if we save the file then reopenning it, displays it as a float. It is strange. I'll open a new topic to explain in details.

@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 21, 2016

I did not try if there is the same bug with other libraries pyexcel-xls etc...

@chfw

This comment has been minimized.

Member

chfw commented Apr 21, 2016

with xls and xlsx, you will get float by default even though you have saved integer. with csv, you will get text(string). ods format, it is float but it seems there is a way around it.

@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 21, 2016

Now I tried with xls, it seems that the bug is fixed in the new version 0.2.1 for every file type

with xls and xlsx, you will get float by default even though you have saved integer. with csv, you will get text(string). ods format, it is float but it seems there is a way around it.

I'm not talking about this issue, but the bug described in this topic. I wanted to be sure that the bug has been fixed for every file type

@chfw

This comment has been minimized.

Member

chfw commented Apr 21, 2016

I see your point but xls uses xlrd, xlsx uses openxy and csv itself does not support data types other than string. So, these are outside the control of pyexcel unless pyexcel strives to rewrite those libraries.

However, with the constraints, pyexcel have the facilities to normalize the types after reading them back through the underneath libraries.

@chfw chfw reopened this Apr 21, 2016

@chfw

This comment has been minimized.

Member

chfw commented Apr 21, 2016

Here is a list of data types supported by xlrd, where integer isn't listed:

XL_CELL_EMPTY   0   empty string u''
XL_CELL_TEXT    1   a Unicode string
XL_CELL_NUMBER  2   float
XL_CELL_DATE    3   float
XL_CELL_BOOLEAN     4   int; 1 means TRUE, 0 means FALSE
XL_CELL_ERROR   5   int representing internal Excel codes; for a text representation, refer to the supplied dictionary error_text_from_code
XL_CELL_BLANK
@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 21, 2016

I'm not talking about integer here in this topic, but when a string is displayed as a float. Maybe you've mixed the two topics.

@chfw

This comment has been minimized.

Member

chfw commented Apr 21, 2016

Let me correct myself, with xlsx, you can save integer and get it back. for xls, you get float. csv, you get string. Here's what I have tried with(pyexcel=0.2.0 and other plugins were 0.1.x):

>>> import pyexcel,pyexce.ext.xlsx
>>> a=[[1,2],[1.1,2.2]]
>>> pyexcel.save_as(array=a, dest_file_name="test.xlsx")
>>> s=pyexcel.get_sheet(file_name="test.xlsx")
>>> s.to_array()
[[1, 2], [1.1, 2.2]]
>>> import pyexcel.ext.xls
>>> pyexcel.save_as(array=a, dest_file_name="test.xls")               
>>> s2=pyexcel.get_sheet(file_name="test.xls")
>>> s2.to_array()
[[1.0, 2.0], [1.1, 2.2]]
>>> pyexcel.save_as(array=a, dest_file_name="test.csv")
>>> s3=pyexcel.get_sheet(file_name="test.csv")
>>> s3.to_array()
[[u'1', u'2'], [u'1.1', u'2.2']]
@chfw

This comment has been minimized.

Member

chfw commented Apr 21, 2016

I see.. sorry about that...

@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 21, 2016

lol

@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 21, 2016

Here is the one talking about integers:

#32

@chfw

This comment has been minimized.

Member

chfw commented Apr 21, 2016

this bug(string as float) is with pyexcel not with pyexcel-xls. the fix I did was to tell texttable that all content are of string type and I used to convert every type to string and then ask texttable to display all.

@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 21, 2016

Ok, that sounds good, so you can close this topic again lol

@chfw chfw closed this Apr 21, 2016

@chfw

This comment has been minimized.

Member

chfw commented Apr 24, 2016

An upddate: the fix is released with pyexcel v0.2.1. And to avoid confusion, pyexcel-text 0.2.0 is to work with v0.2.1. Previous version won't work at all with pyexcel v0.2.1.

@Navis-Raven

This comment has been minimized.

Navis-Raven commented Apr 25, 2016

OK let's try it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment