ENH: JSON #3876

Merged
merged 4 commits into from Jun 13, 2013
@@ -954,13 +954,21 @@ with optional parameters:
- path_or_buf : the pathname or buffer to write the output
This can be ``None`` in which case a JSON string is returned
-- orient : The format of the JSON string, default is ``index`` for ``Series``, ``columns`` for ``DataFrame``
+- orient :
- * split : dict like {index -> [index], columns -> [columns], data -> [values]}
- * records : list like [{column -> value}, ... , {column -> value}]
- * index : dict like {index -> {column -> value}}
- * columns : dict like {column -> {index -> value}}
- * values : just the values array
+ Series :
+ default is 'index', allowed values are: {'split','records','index'}
+
+ DataFrame :
+ default is 'columns', allowed values are: {'split','records','index','columns','values'}
+
+ The format of the JSON string
+
+ * split : dict like {index -> [index], columns -> [columns], data -> [values]}
+ * records : list like [{column -> value}, ... , {column -> value}]
+ * index : dict like {index -> {column -> value}}
+ * columns : dict like {column -> {index -> value}}
+ * values : just the values array
- date_format : type of date conversion (epoch = epoch milliseconds, iso = ISO8601), default is epoch
- double_precision : The number of decimal places to use when encoding floating point values, default 10.
@@ -989,6 +997,8 @@ Writing to a file, with a date index and a date column
dfj2 = dfj.copy()
dfj2['date'] = Timestamp('20130101')
+ dfj2['ints'] = range(5)
+ dfj2['bools'] = True
dfj2.index = date_range('20130101',periods=5)
dfj2.to_json('test.json')
open('test.json').read()
@@ -1005,31 +1015,86 @@ is ``None``. To explicitly force ``Series`` parsing, pass ``typ=series``
is expected. For instance, a local file could be
file://localhost/path/to/table.json
- typ : type of object to recover (series or frame), default 'frame'
-- orient : The format of the JSON string, one of the following
+- orient :
+
+ Series :
+ default is 'index', allowed values are: {'split','records','index'}
+
+ DataFrame :
+ default is 'columns', allowed values are: {'split','records','index','columns','values'}
+
+ The format of the JSON string
- * split : dict like {index -> [index], name -> name, data -> [values]}
- * records : list like [value, ... , value]
- * index : dict like {index -> value}
+ * split : dict like {index -> [index], columns -> [columns], data -> [values]}
+ * records : list like [{column -> value}, ... , {column -> value}]
+ * index : dict like {index -> {column -> value}}
+ * columns : dict like {column -> {index -> value}}
+ * values : just the values array
-- dtype : dtype of the resulting object
-- numpy : direct decoding to numpy arrays. default True but falls back to standard decoding if a problem occurs.
-- parse_dates : a list of columns to parse for dates; If True, then try to parse datelike columns, default is False
+- dtype : if True, infer dtypes, if a dict of column to dtype, then use those, if False, then don't infer dtypes at all, default is True, apply only to the data
+- convert_axes : boolean, try to convert the axes to the proper dtypes, default is True
+- convert_dates : a list of columns to parse for dates; If True, then try to parse datelike columns, default is True
- keep_default_dates : boolean, default True. If parsing dates, then parse the default datelike columns
+- numpy: direct decoding to numpy arrays. default is False;
+ Note that the JSON ordering **MUST** be the same for each term if ``numpy=True``
The parser will raise one of ``ValueError/TypeError/AssertionError`` if the JSON is
not parsable.
+The default of ``convert_axes=True``, ``dtype=True``, and ``convert_dates=True`` will try to parse the axes, and all of the data
+into appropriate types, including dates. If you need to override specific dtypes, pass a dict to ``dtype``. ``convert_axes`` should only
+be set to ``False`` if you need to preserve string-like numbers (e.g. '1', '2') in an axis.
+
+.. warning::
+
+ When reading JSON data, automatic coercing into dtypes has some quirks:
+
+ * an index can be in a different order, that is the returned order is not guaranteed to be the same as before serialization
+ * a column that was ``float`` data will be converted to ``integer`` if it can be done safely, e.g. a column of ``1.``
+ * bool columns will be converted to ``integer`` on reconstruction
+
+ Thus there are times where you may want to specify specific dtypes via the ``dtype`` keyword argument.
+
Reading from a JSON string
.. ipython:: python
pd.read_json(json)
-Reading from a file, parsing dates
+Reading from a file
+
+.. ipython:: python
+
+ pd.read_json('test.json')
+
+Don't convert any data (but still convert axes and dates)
+
+.. ipython:: python
+
+ pd.read_json('test.json',dtype=object).dtypes
+
+Specify how I want to convert data
+
+.. ipython:: python
+
+ pd.read_json('test.json',dtype={'A' : 'float32', 'bools' : 'int8'}).dtypes
+
+I like my string indices
.. ipython:: python
- pd.read_json('test.json',parse_dates=True)
+ si = DataFrame(np.zeros((4, 4)),
+ columns=range(4),
+ index=[str(i) for i in range(4)])
+ si
+ si.index
+ si.columns
+ json = si.to_json()
+
+ sij = pd.read_json(json,convert_axes=False)
+ sij
+ sij.index
+ sij.columns
.. ipython:: python
:suppress:
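
The string-index example in this hunk touches on a general property of JSON: object keys are always strings, so integer labels and string labels that look like numbers cannot be told apart once serialized. The sketch below shows this with the standard-library `json` module only. It is not how pandas implements `convert_axes`; the `table` mapping is a hypothetical stand-in for a small frame in ``columns`` orientation.

```python
import json

# Hypothetical stand-in for a 2x2 frame in "columns" orientation:
# integer column labels (0, 1) and string index labels ("0", "1").
table = {0: {"0": 0.0, "1": 0.0}, 1: {"0": 0.0, "1": 0.0}}

# Serializing turns the integer column labels into JSON string keys.
text = json.dumps(table)

# After decoding, both levels of labels are plain strings.
decoded = json.loads(text)
column_labels = list(decoded)
index_labels = list(decoded["0"])

print(column_labels, index_labels)
```

A reader therefore has to choose between coercing labels back to numbers and leaving them as strings, which is the choice the ``convert_axes`` option in the diff exposes.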
@@ -507,8 +507,15 @@ def to_json(self, path_or_buf=None, orient=None, date_format='epoch',
----------
path_or_buf : the path or buffer to write the result string
if this is None, return a StringIO of the converted string
- orient : {'split', 'records', 'index', 'columns', 'values'},
- default is 'index' for Series, 'columns' for DataFrame
+ orient :
+
+ Series :
+ default is 'index'
+ allowed values are: {'split','records','index'}
+
+ DataFrame :
+ default is 'columns'
+ allowed values are: {'split','records','index','columns','values'}
The format of the JSON string
split : dict like
@@ -517,6 +524,7 @@ def to_json(self, path_or_buf=None, orient=None, date_format='epoch',
index : dict like {index -> {column -> value}}
columns : dict like {column -> {index -> value}}
values : just the values array
+
date_format : type of date conversion (epoch = epoch milliseconds, iso = ISO8601),
default is epoch
double_precision : The number of decimal places to use when encoding
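
The five ``orient`` layouts listed in the docstring can be sketched in plain Python. This is a minimal illustration using only the standard library, not the pandas implementation; the toy table and the `layout` helper are hypothetical names introduced here.

```python
import json

# Hypothetical toy table: two rows ("r1", "r2") and two columns ("A", "B").
index = ["r1", "r2"]
columns = ["A", "B"]
data = [[1, 2], [3, 4]]


def layout(orient):
    """Build the plain-Python structure that each ``orient`` describes."""
    if orient == "split":
        # dict like {index -> [index], columns -> [columns], data -> [values]}
        return {"index": index, "columns": columns, "data": data}
    if orient == "records":
        # list like [{column -> value}, ... , {column -> value}]
        return [dict(zip(columns, row)) for row in data]
    if orient == "index":
        # dict like {index -> {column -> value}}
        return {i: dict(zip(columns, row)) for i, row in zip(index, data)}
    if orient == "columns":
        # dict like {column -> {index -> value}}
        return {c: {i: row[j] for i, row in zip(index, data)}
                for j, c in enumerate(columns)}
    if orient == "values":
        # just the values array
        return data
    raise ValueError("unknown orient: %r" % (orient,))


for orient in ("split", "records", "index", "columns", "values"):
    print(orient, json.dumps(layout(orient)))
```

Only ``split``, ``index`` and ``columns`` carry the row labels; ``records`` and ``values`` drop them, so a round trip through those two cannot restore the original index.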