npyio.loadtxt is bytes-casting text file input, even with str dtype specified. #2715

Panoplos opened this Issue Nov 8, 2012 · 6 comments

  • Python: Version 3.3 ( release) on OS X Mountain Lion
  • numpy: Cloned from git master

When calling numpy.loadtxt on a file containing strings, as follows:

import numpy as np
datestxt = np.loadtxt("NYSE_dates.txt", dtype=str)

Where NYSE_dates.txt is simply a list of dates (could be anything really):


Output is:

["b'7/5/1962'" "b'7/6/1962'" "b'7/9/1962'" ..., "b'12/29/2020'"
 "b'12/30/2020'" "b'12/31/2020'"]

As you can see, every string has been cast to bytes and then stringified through conv; you would get the same result from str(str('12/31/2020').encode('latin1')), per conv and compat.asbytes.
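The effect described above can be reproduced without numpy. This is only a minimal sketch of the Python 3 behaviour, not the loadtxt code itself; the variable names are illustrative.

```python
# In Python 3, str() applied to a bytes object returns its repr,
# so the b'...' prefix and quotes become part of the text.
raw = "12/31/2020".encode("latin1")  # the bytes form of the field, as in the report
mangled = str(raw)
print(mangled)  # b'12/31/2020'

# Decoding, rather than calling str(), recovers the original text.
restored = raw.decode("latin1")
print(restored)  # 12/31/2020
```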

After looking at the code, it appears that strings are cast to bytes with asbytes(...) pretty much throughout, for example in split_line(...), so this likely means every routine in the module is affected.


I have the same issue. This is very annoying; basically you can't use loadtxt in Python 3.

Temporary solution: I removed all asbytes() calls in the loadtxt function.

NumPy member

Yeah, I remember thinking something was fishy in there when I looked through the code.


For the record, I am running into the same issue with datetime64 inputs, leading to a parsing error of the form: Error parsing datetime string "b'2013-01-02'". To work around this, I had to create a converter for that column:

def decoder(input_bytes):
    return input_bytes.decode("ascii")
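A slightly more defensive variant of this converter is sketched below. The name to_text and its encoding parameter are hypothetical, and it assumes a field may arrive either as bytes (the behaviour reported here) or already as str.

```python
def to_text(field, encoding="ascii"):
    """Return the field as str, decoding it first if it arrived as bytes."""
    if isinstance(field, bytes):
        return field.decode(encoding)
    return field


# Both forms normalise to the same text.
print(to_text(b"2013-01-02"))  # 2013-01-02
print(to_text("2013-01-02"))   # 2013-01-02
```

Such a function would be passed to loadtxt through its converters argument, for example converters={0: to_text}, assuming for illustration that the dates sit in column 0.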

This would be fine in production code but is highly non-pretty for training material...

juliantaylor added this to the 1.10 blockers milestone on Jul 30, 2014
charris changed the milestone from 1.10 blockers to 1.11 blockers on Jun 21, 2015
NumPy member

Pushing off to 1.11.


Workaround: run iconv on the file first.

NumPy member

Pushing off to 1.12.
