Closed
Labels: Dependencies (required and optional dependencies)
Description
MWE
from __future__ import print_function
import pandas as pd
import numpy as np
print("pandas version:", pd.__version__)
print("+++++++++++++++++++++++++++++++++++")
pd.show_versions()  # prints the report itself and returns None
print("+++++++++++++++++++++++++++++++++++")
####################################################
# Config
####################################################
pd.options.display.max_rows = 10
pd.options.display.float_format = '{:.1f}'.format
####################################################
# Read data
####################################################
file = "/tmp/california_housing_train.csv"
if np.DataSource().exists(file):
    # Use the local copy when it is available
    dataset = file
else:
    # Otherwise read the CSV straight from the download URL
    dataset = "https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv"
sep = ","
california_housing_dataframe = pd.read_csv(dataset, sep=sep)
####################################################
# Reorder
####################################################
newOrder = np.random.permutation(california_housing_dataframe.index)
california_housing_dataframe_reordered = california_housing_dataframe.reindex(newOrder)
####################################################
# Merge and show diff of the heads
####################################################
# Let's take the heads of both datasets and compare them
# They should differ in (almost) all elements
head1 = california_housing_dataframe.head(10)
head2 = california_housing_dataframe_reordered.head(10)
# @see https://stackoverflow.com/a/36893675/605890
merged = head1.merge(head2, indicator=True, how='outer')
print(merged)
Run on colab
I created a colab for the MWE, which is based on pandas 0.22.0:
https://colab.research.google.com/drive/19uDE_H4AtpLaEL6INrRrDMXkdANsNr69#scrollTo=CzxuGppV26Rt
If you run it, the _merge column of the output shows (unless a row happens to appear in both heads by chance):
- 10x left_only
- 10x right_only
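For readers unfamiliar with the indicator column, here is a minimal sketch of how an outer merge with indicator=True labels rows. The two small frames `left` and `right` are made up for illustration and are not part of the MWE.

```python
import pandas as pd

# Two tiny illustrative frames that share exactly one row: (a=3, b=30).
left = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})
right = pd.DataFrame({"a": [3, 4, 5], "b": [30, 40, 50]})

# With no `on` argument, merge joins on all common columns (here: a and b).
# indicator=True adds a `_merge` column telling where each row came from.
merged = left.merge(right, how="outer", indicator=True)

# Count how many rows fall into each category.
counts = merged["_merge"].astype(str).value_counts().to_dict()

print(merged)
print(counts)
```

Rows found only in the first frame are tagged left_only, rows found only in the second are tagged right_only, and rows present in both are tagged both. Two heads with no rows in common therefore give only left_only and right_only entries.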
Run with docker containers
Now run the same MWE (located at /tmp/tf/Bug.py) in two different docker containers, both of which use pandas 0.23.4.
Both return:
- 10x both
This means both heads are identical, i.e. reindex did not have any effect.
Python docker container (python 3.6.6)
docker run --rm -it -v /tmp/tf/:/tmp/ python:3.6.6 /bin/bash -c "pip install pandas && python /tmp/Bug.py"
tensorflow docker container (tensorflow 1.11.0)
docker run --rm -it -v /tmp/tf/:/tmp/ tensorflow/tensorflow:1.11.0-py3 python /tmp/Bug.py
TLDR
The following code does not have any effect in pandas 0.23.4:
california_housing_dataframe_reordered = california_housing_dataframe.reindex(newOrder)
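One way to narrow this down, sketched here under assumptions, is to check the two steps separately: whether np.random.permutation returned a shuffled order at all, and whether reindex then followed the order it was given. The small frame `df`, the fixed seed, and passing `df.index.values` (a plain NumPy array) instead of the Index object are illustrative choices, not part of the original MWE. The sketch does not claim to identify the actual cause.

```python
import numpy as np
import pandas as pd

# Small stand-in for the housing data (hypothetical, for illustration only).
df = pd.DataFrame({"value": np.arange(100)})

# Step 1: is the permutation itself shuffled?
# Assumption: a plain integer array is passed rather than the Index object,
# so any Index-specific behaviour is taken out of the picture.
rng = np.random.RandomState(0)
new_order = rng.permutation(df.index.values)
permutation_changed = not np.array_equal(new_order, df.index.values)

# Step 2: does reindex follow the order it is given?
reordered = df.reindex(new_order)
reindex_followed = np.array_equal(reordered.index.values, new_order)

# Alternative shuffle that does not go through np.random.permutation.
shuffled = df.sample(frac=1, random_state=0)

print("permutation changed the order:", permutation_changed)
print("reindex followed the new order:", reindex_followed)
print(shuffled.head())
```

If the first check fails in an affected environment, the problem lies in how the new order is generated rather than in reindex. Printing np.__version__ alongside pd.__version__ in each environment would help with that comparison.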