Hello,
I noticed that the size of a CSV file generated with Pandas differs depending on which version of Python is used.
I was expecting the same behaviour for the same code.
With Python 3
pc:~ scls$ cat big.py
import pandas as pd
import numpy as np
(rows, cols) = (400, 10)
a = np.random.random((rows, cols))
df = pd.DataFrame(a)
filename = "big_random.csv"
df.to_csv(filename, index=False)
pc:~ scls$ python --version
Python 3.4.3 :: Anaconda 2.3.0 (x86_64)
pc:~ scls$ python big.py
pc:~ scls$ head big_random.csv
0,1,2,3,4,5,6,7,8,9
0.7175194974125143,0.9868374217576047,0.032049602250014075,0.6289681136928122,0.8096042270600179,0.33685028982497345,0.15762455315620005,0.4691775579462879,0.4456870865050826,0.18795963399879667
0.6587841850716022,0.9138862585748279,0.725186213931711,0.8455808154946725,0.7513894749589103,0.1264561639354813,0.22313629403106283,0.7082854809639854,0.6372581410511284,0.39526526133363016
0.41655411171973145,0.82608272240786,0.39046502732419675,0.21280845299958473,0.7260928524192569,0.13413288736071716,0.6403422588148618,0.38493112678936114,0.1008469225955716,0.7569988810703301
0.4153526009936582,0.31647402611493414,0.9975731184442808,0.6426165566647016,0.09261643366931205,0.3227891182788275,0.8867457623057338,0.27223526407455145,0.3281299815210015,0.9740848774163636
0.18492378563510736,0.6467683901479606,0.040191223061303516,0.06418796210918698,0.6377758098323728,0.3015310590768058,0.35801398526272554,0.3847352145606483,0.5169639983061501,0.7688238573672432
0.12776779442246045,0.13988857304612567,0.5174730743084831,0.48860306709655155,0.6430744296754209,0.7043353997674583,0.9036918523659346,0.8363827082165963,0.10904005101984726,0.3467075055731551
0.8735436905183718,0.3094682378308442,0.3425056806446519,0.6327109907812603,0.027768508379761192,0.7572863534573687,0.013631783039836698,0.9498400284024592,0.7489006948603708,0.26146706653431384
0.00706906732485435,0.398808829510499,0.1603837067149072,0.1162434740119399,0.6308407696050173,0.38437501090290294,0.7084745025285255,0.6766732951295603,0.09640698119674629,0.16475759581133098
0.37288409337600503,0.8170980071434518,0.10346296752178363,0.22734655867481057,0.977310707692392,0.2058569589426188,0.810879704065204,0.4644448946189589,0.7872748134058031,0.21634203693609444
Let's do the same with Python 2
pc:~ scls$ source activate py2
discarding //anaconda/bin from PATH
prepending //anaconda/envs/py2/bin to PATH
(py2)pc:~ scls$ python --version
Python 2.7.10 :: Anaconda 2.3.0 (x86_64)
(py2)pc:~ scls$ python big.py
(py2)pc:~ scls$ head big_random.csv
0,1,2,3,4,5,6,7,8,9
0.5579481683,0.684701543521,0.754306080917,0.618156128389,0.172254680145,0.0174204117472,0.42003733688,0.544810598703,0.501523693218,0.254650528482
0.245211610381,0.242803787702,0.74730831067,0.902427362626,0.79284128878,0.759901967668,0.138869495692,0.409657542539,0.800543764611,0.126875692556
0.157008551856,0.196911813758,0.427114483552,0.513200703916,0.629485103457,0.158393748929,0.725090100741,0.997671387723,0.168756770968,0.307894016467
0.277986851471,0.841819960853,0.948682092484,0.0698344807858,0.843959698756,0.124105138469,0.685600301284,0.638439389501,0.153843520073,0.00693283214343
0.825322391369,0.246830314636,0.76342798427,0.588335209531,0.0639153711562,0.277168287326,0.660799511539,0.246912047114,0.525794863223,0.606527113773
0.422893634037,0.416014910374,0.0282877421175,0.479474754244,0.562079226872,0.554424129574,0.850810096081,0.980219346119,0.376727776223,0.0202092423104
0.107718832593,0.82063197471,0.293988837033,0.0741333403483,0.223505401274,0.506775135928,0.411408416805,0.828313119764,0.670612028027,0.67312260052
0.822882425742,0.0355538636782,0.0453556725915,0.483123830922,0.726536606867,0.265317264415,0.190839972237,0.63416336544,0.776559958794,0.198684003523
0.0159240676555,0.082225869763,0.9188622672,0.628898793501,0.598847602455,0.479313877636,0.830676086143,0.930886044804,0.979980325282,0.42786165221
Compare the number of digits in the first value of each file:
0.7175194974125143
0.5579481683
The same code leads to a difference of 6 digits.
Kind regards
In Python 3, str of a float is the same as its repr (more digits), so that distinct numbers have different representations.
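As a rough illustration of the point above, here is a minimal sketch to run under Python 3. It assumes that Python 2's str for floats behaved like a "%.12g" format (12 significant digits), which matches the digit counts in the Python 2 output above; the variable names are only for illustration.

```python
# Value taken from the first cell of the Python 3 CSV in the report.
x = 0.7175194974125143

# Python 3: str() and repr() of a float give the same text, the shortest
# string that converts back to exactly the same float.
py3_text = str(x)

# Assumption: Python 2's str() for floats was equivalent to "%.12g",
# i.e. at most 12 significant digits.
py2_like_text = "%.12g" % x

print(py3_text)
print(py2_like_text)

# The longer form round-trips; the 12-digit form loses information here.
print(float(py3_text) == x)
print(float(py2_like_text) == x)
```

So the longer Python 3 output is what allows the CSV values to be read back as exactly the same floats.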
Furthermore, when you compare random numbers like this, you should use np.random.seed to ensure that both runs actually see the same values.
https://docs.python.org/2/tutorial/floatingpoint.html
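A possible way to make the two runs comparable is sketched below. It seeds NumPy before generating the data, as suggested above, and additionally passes float_format to to_csv so the number of digits no longer depends on the interpreter. The helper name make_csv, the seed value 0, and the "%.12g" format are illustrative choices, not something prescribed in this thread.

```python
import io

import numpy as np
import pandas as pd


def make_csv(seed=0, float_format=None):
    """Build the 400x10 random frame from the report and return it as CSV text."""
    np.random.seed(seed)  # same seed -> same random values on every run
    df = pd.DataFrame(np.random.random((400, 10)))
    buf = io.StringIO()
    # float_format fixes how floats are written, independent of str(float)
    df.to_csv(buf, index=False, float_format=float_format)
    return buf.getvalue()


full = make_csv()                        # default float formatting
short = make_csv(float_format="%.12g")   # at most 12 significant digits

print(len(full), len(short))
```

With an explicit float_format, the file size should no longer depend on how a given Python version converts floats to text, at the cost of losing exact round-tripping when fewer digits are written.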