Start implementing time64 type (#2921)
This PR adds the type time64, with basic properties such as converting to/from Python and being able to display in the console and in Jupyter notebooks.

WIP for #2911
st-pasha committed Apr 2, 2021
1 parent f191b08 commit 8441bf6
Showing 27 changed files with 646 additions and 38 deletions.
2 changes: 2 additions & 0 deletions docs/api/type.rst
@@ -19,6 +19,7 @@
- :attr:`dt.Type.obj64`
- :attr:`dt.Type.str32`
- :attr:`dt.Type.str64`
- :attr:`dt.Type.time64`
- :attr:`dt.Type.void`


@@ -55,4 +56,5 @@
obj64 <type/obj64>
str32 <type/str32>
str64 <type/str64>
time64 <type/time64>
void <type/void>
26 changes: 26 additions & 0 deletions docs/api/type/time64.rst
@@ -0,0 +1,26 @@

.. xattr:: datatable.Type.time64
:src: --

.. x-version-added:: 1.0.0

The ``time64`` type is used to represent a specific moment in time. This
corresponds to ``datetime`` in Python, or ``timestamp`` in Arrow or pandas.
Internally, this type is stored as a 64-bit integer containing the number of
nanoseconds since the epoch (Jan 1, 1970) in UTC.
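
A minimal sketch of the storage model described above, written with Python's standard library only rather than datatable itself. The helper names `to_ns` and `from_ns` are hypothetical, and the integer computed here is only assumed to match what a ``time64`` column would hold for the same UTC moment.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_ns(moment):
    """Whole nanoseconds between the epoch and a timezone-aware datetime."""
    delta = moment - EPOCH
    # Integer arithmetic avoids the float rounding of total_seconds().
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * 1_000_000_000 + delta.microseconds * 1_000

def from_ns(ns):
    """Inverse of to_ns; digits below one microsecond are dropped,
    because Python's datetime cannot represent them."""
    return EPOCH + timedelta(microseconds=ns // 1_000)

moment = datetime(2021, 3, 17, 9, 0, 0, tzinfo=timezone.utc)
ns = to_ns(moment)
print(ns)           # 1615971600000000000
print(from_ns(ns))  # 2021-03-17 09:00:00+00:00
```

Note that the round trip is lossy below one microsecond, since a 64-bit nanosecond count is finer-grained than Python's ``datetime``.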

This type is not `leap-seconds`_ aware, meaning that it assumes that each day
has exactly 24×3600 seconds. In practice this means that the time
difference computed between two ``time64`` moments may be off by the number of
leap seconds that occurred between them.
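
To make the leap-second caveat concrete, here is a small standard-library sketch (not datatable code). It uses the same "every day has 86400 seconds" model; the helper `epoch_ns` is a hypothetical name and handles whole seconds only.

```python
from datetime import datetime, timezone

NS_PER_SECOND = 1_000_000_000
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def epoch_ns(moment):
    """Nanoseconds since the epoch, counting every day as 86400 seconds."""
    delta = moment - EPOCH
    return (delta.days * 86_400 + delta.seconds) * NS_PER_SECOND

# A leap second (23:59:60 UTC) was inserted at the end of 2016-12-31.
before = datetime(2016, 12, 31, tzinfo=timezone.utc)
after = datetime(2017, 1, 1, tzinfo=timezone.utc)

elapsed_seconds = (epoch_ns(after) - epoch_ns(before)) // NS_PER_SECOND
print(elapsed_seconds)  # 86400, although 86401 SI seconds actually elapsed
```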

A ``time64`` column may also carry a time zone as meta information. This time
zone is used to convert the timestamp from the absolute UTC time to the local
calendar. For example, suppose you have two ``time64`` columns: one is in UTC
while the other is in the *America/Los\_Angeles* time zone. Assume both columns
store the same value ``1577836800000000000``. Then these two columns represent
the same moment in time, but their calendar representations differ:
``2020-01-01T00:00:00Z`` and ``2019-12-31T16:00:00-0800`` respectively.
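
The time zone example can be checked with a short standard-library sketch. It assumes the stored integer counts nanoseconds, as the type description states, so 2020-01-01T00:00:00Z corresponds to 1577836800 seconds × 10^9. A fixed UTC-8 offset stands in for *America/Los\_Angeles* here, which holds for this winter date only; the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

NS_PER_SECOND = 1_000_000_000
stored = 1_577_836_800 * NS_PER_SECOND  # same integer in both columns

# Interpret the stored value as an absolute moment in UTC.
utc_view = datetime.fromtimestamp(stored // NS_PER_SECOND, tz=timezone.utc)

# Assumption: a fixed -08:00 offset approximates Los Angeles on this date.
los_angeles_winter = timezone(timedelta(hours=-8))
la_view = utc_view.astimezone(los_angeles_winter)

print(utc_view.isoformat())  # 2020-01-01T00:00:00+00:00
print(la_view.isoformat())   # 2019-12-31T16:00:00-08:00
print(utc_view == la_view)   # True: same moment, different calendars
```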


.. _`leap-seconds`: https://en.wikipedia.org/wiki/Leap_second
9 changes: 8 additions & 1 deletion docs/releases/v1.0.0.rst
@@ -10,11 +10,18 @@
of previous stypes, and will eventually replace them.

-[new] New column type :attr:`dt.Type.date32` added, which can store a
calendar date [#1646]::
calendar date [#2858]::

>>> import datetime
>>> DT = dt.Frame([datetime.date(2021, 2, 17)])

-[new] New column type :attr:`dt.Type.time64` added, which can store
timestamps within a certain time zone (in a single column all times
must be in the same time zone) [#2911]::

>>> import datetime
>>> DT = dt.Frame([datetime.datetime(2021, 3, 17, 9, 0, 0)])

-[new] A Frame can now be constructed from an Arrow
table::

1 change: 1 addition & 0 deletions src/core/_dt.h
@@ -78,6 +78,7 @@ namespace py {
class obool;
class oby;
class odate;
class odatetime;
class odict;
class ofloat;
class oint;
6 changes: 6 additions & 0 deletions src/core/column.cc
@@ -275,6 +275,11 @@ py::oobj Column::get_element_as_pyobject(size_t i) const {
bool isvalid = get_element(i, &x);
return isvalid? py::odate(x) : py::None();
}
case dt::SType::TIME64: {
int64_t x;
bool isvalid = get_element(i, &x);
return isvalid? py::odatetime(x) : py::None();
}
case dt::SType::OBJ: {
py::oobj x;
bool isvalid = get_element(i, &x);
@@ -303,6 +308,7 @@ bool Column::get_element_isvalid(size_t i) const {
int32_t x;
return get_element(i, &x);
}
case dt::SType::TIME64:
case dt::SType::INT64: {
int64_t x;
return get_element(i, &x);
6 changes: 4 additions & 2 deletions src/core/column/sentinel.cc
@@ -32,11 +32,12 @@ Column Sentinel_ColumnImpl::make_column(size_t nrows, SType stype) {
case SType::BOOL: return Column(new SentinelBool_ColumnImpl(nrows));
case SType::INT8: return Column(new SentinelFw_ColumnImpl<int8_t>(nrows, stype));
case SType::INT16: return Column(new SentinelFw_ColumnImpl<int16_t>(nrows, stype));
case SType::DATE32:
case SType::INT32: return Column(new SentinelFw_ColumnImpl<int32_t>(nrows, stype));
case SType::TIME64:
case SType::INT64: return Column(new SentinelFw_ColumnImpl<int64_t>(nrows, stype));
case SType::FLOAT32: return Column(new SentinelFw_ColumnImpl<float>(nrows, stype));
case SType::FLOAT64: return Column(new SentinelFw_ColumnImpl<double>(nrows, stype));
case SType::DATE32: return Column(new SentinelFw_ColumnImpl<int32_t>(nrows, stype));
case SType::STR32: return Column(new SentinelStr_ColumnImpl<uint32_t>(nrows));
case SType::STR64: return Column(new SentinelStr_ColumnImpl<uint64_t>(nrows));
case SType::OBJ: return Column(new SentinelObj_ColumnImpl(nrows));
@@ -55,11 +56,12 @@ Column Sentinel_ColumnImpl::make_fw_column(
case SType::BOOL: return Column(new SentinelBool_ColumnImpl(nrows, std::move(buf)));
case SType::INT8: return Column(new SentinelFw_ColumnImpl<int8_t>(nrows, stype, std::move(buf)));
case SType::INT16: return Column(new SentinelFw_ColumnImpl<int16_t>(nrows, stype, std::move(buf)));
case SType::DATE32:
case SType::INT32: return Column(new SentinelFw_ColumnImpl<int32_t>(nrows, stype, std::move(buf)));
case SType::TIME64:
case SType::INT64: return Column(new SentinelFw_ColumnImpl<int64_t>(nrows, stype, std::move(buf)));
case SType::FLOAT32: return Column(new SentinelFw_ColumnImpl<float>(nrows, stype, std::move(buf)));
case SType::FLOAT64: return Column(new SentinelFw_ColumnImpl<double>(nrows, stype, std::move(buf)));
case SType::DATE32: return Column(new SentinelFw_ColumnImpl<int32_t>(nrows, stype, std::move(buf)));
case SType::OBJ: return Column(new SentinelObj_ColumnImpl(nrows, std::move(buf)));
default:
throw ValueError()
23 changes: 21 additions & 2 deletions src/core/column_from_python.cc
@@ -314,6 +314,22 @@ static Column force_as_date32(const Column& inputcol) {



//------------------------------------------------------------------------------
// Time64
//------------------------------------------------------------------------------

static size_t parse_as_time64(const Column& inputcol, Buffer& mbuf, size_t i0) {
return parse_as_X<int64_t>(inputcol, mbuf, i0,
[](const py::oobj& item, int64_t* out) {
return item.parse_datetime(out) ||
item.parse_date(out) ||
item.parse_none(out);
});
}




//------------------------------------------------------------------------------
// String
//------------------------------------------------------------------------------
@@ -494,7 +510,7 @@ static const std::vector<dt::SType>& successors(dt::SType stype) {
static styvec s_void = {
dt::SType::BOOL, dt::SType::INT8, dt::SType::INT16, dt::SType::INT32,
dt::SType::INT64, dt::SType::FLOAT32, dt::SType::FLOAT64, dt::SType::STR32,
dt::SType::DATE32
dt::SType::DATE32, dt::SType::TIME64
};
static styvec s_bool8 = {dt::SType::INT8, dt::SType::INT16, dt::SType::INT32, dt::SType::INT64, dt::SType::FLOAT64, dt::SType::STR32};
static styvec s_int8 = {dt::SType::INT16, dt::SType::INT32, dt::SType::INT64, dt::SType::FLOAT64, dt::SType::STR32};
@@ -505,7 +521,8 @@ static const std::vector<dt::SType>& successors(dt::SType stype) {
static styvec s_float64 = {dt::SType::STR32};
static styvec s_str32 = {dt::SType::STR64};
static styvec s_str64 = {};
static styvec s_date32 = {};
static styvec s_date32 = {dt::SType::TIME64};
static styvec s_time64 = {};

switch (stype) {
case dt::SType::VOID: return s_void;
@@ -519,6 +536,7 @@ static const std::vector<dt::SType>& successors(dt::SType stype) {
case dt::SType::STR32: return s_str32;
case dt::SType::STR64: return s_str64;
case dt::SType::DATE32: return s_date32;
case dt::SType::TIME64: return s_time64;
default:
throw RuntimeError() << "Unknown successors of type " << stype; // LCOV_EXCL_LINE
}
@@ -551,6 +569,7 @@ static Column parse_column_auto_type(const Column& inputcol) {
case dt::SType::STR32: j = parse_as_str<uint32_t>(inputcol, databuf, strbuf); break;
case dt::SType::STR64: j = parse_as_str<uint64_t>(inputcol, databuf, strbuf); break;
case dt::SType::DATE32: j = parse_as_date32(inputcol, databuf, i); break;
case dt::SType::TIME64: j = parse_as_time64(inputcol, databuf, i); break;
default: continue; // try another stype
}
if (j != i) {
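
The fallback order in `parse_as_time64` (datetime, then date, then None) can be illustrated with a plain-Python sketch. This is not the datatable implementation: `parse_item` is a hypothetical helper, and three behaviours are assumptions made for illustration only, namely that a plain date maps to midnight of that day, that timezone-aware datetimes are normalized to UTC, and that the parser returns the index at which it stopped.

```python
from datetime import date, datetime, timezone

NS_PER_SECOND = 1_000_000_000
NS_PER_DAY = 86_400 * NS_PER_SECOND
EPOCH = datetime(1970, 1, 1)

def parse_item(item):
    """Return (ok, value); value is nanoseconds since the epoch, or None for NA."""
    # datetime is a subclass of date, so it has to be tested first.
    if isinstance(item, datetime):
        if item.tzinfo is not None:
            # Assumption: aware datetimes are normalized to UTC before storing.
            item = item.astimezone(timezone.utc).replace(tzinfo=None)
        delta = item - EPOCH
        ns = (delta.days * 86_400 + delta.seconds) * NS_PER_SECOND
        return True, ns + delta.microseconds * 1_000
    if isinstance(item, date):
        # Assumption: a plain date is taken as midnight at the start of that day.
        return True, (item - EPOCH.date()).days * NS_PER_DAY
    if item is None:
        return True, None
    return False, None

def parse_as_time64(items, start=0):
    """Parse items[start:]; return (values, index where parsing stopped)."""
    values = []
    i = start
    while i < len(items):
        ok, value = parse_item(items[i])
        if not ok:
            break  # the caller would retry from here with another type
        values.append(value)
        i += 1
    return values, i

values, stop = parse_as_time64(
    [datetime(2021, 3, 17, 9), date(2021, 2, 17), None, "x"]
)
print(values, stop)
```

Accepting dates inside the time64 parser is what makes the `date32 -> time64` entry in `successors` workable: a column that starts with dates and later meets a datetime can be re-read as time64 without rejecting the earlier items.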
51 changes: 51 additions & 0 deletions src/core/csv/toa.cc
@@ -21,6 +21,7 @@
//------------------------------------------------------------------------------
#include "csv/toa.h"
#include "lib/hh/date.h"
#include "utils/assert.h"



@@ -102,3 +103,53 @@ void date32_toa(char** pch, int32_t value) {
}
*pch = ch;
}


// Maximum space needed: 29 chars
// <date> : 10 chars
// T : 1
// <time> : 8 chars
// . : 1
// <ns> : 9
//
void time64_toa(char** pch, int64_t time) {
static constexpr int64_t NANOSECONDS_PER_SECOND = 1000000000LL;
static constexpr int64_t NANOSECONDS_PER_DAY = 24LL * 3600LL * 1000000000LL;

auto days = (time >= 0)? time / NANOSECONDS_PER_DAY
: (time + 1) / NANOSECONDS_PER_DAY - 1;
auto time_of_day = time - days * NANOSECONDS_PER_DAY;
xassert(time_of_day >= 0);
auto ns = time_of_day % NANOSECONDS_PER_SECOND;
time_of_day /= NANOSECONDS_PER_SECOND;
auto seconds = time_of_day % 60;
time_of_day /= 60;
auto minutes = time_of_day % 60;
time_of_day /= 60;
auto hours = time_of_day;

xassert(days < 110000 && days > -110000);
date32_toa(pch, static_cast<int>(days));
char* ch = *pch;
*ch++ = 'T';
*ch++ = static_cast<char>('0' + (hours / 10));
*ch++ = static_cast<char>('0' + (hours % 10));
*ch++ = ':';
*ch++ = static_cast<char>('0' + (minutes / 10));
*ch++ = static_cast<char>('0' + (minutes % 10));
*ch++ = ':';
*ch++ = static_cast<char>('0' + (seconds / 10));
*ch++ = static_cast<char>('0' + (seconds % 10));
if (ns) {
*ch++ = '.';
int64_t factor = NANOSECONDS_PER_SECOND / 10;
while (ns) {
auto digit = ns / factor;
xassert(digit < 10);
*ch++ = static_cast<char>('0' + digit);
ns -= digit * factor;
factor /= 10;
}
}
*pch = ch;
}
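
For readers who want to check the arithmetic in `time64_toa`, the following is a rough Python equivalent, not part of the commit. The name `time64_to_str` is hypothetical. Python's `divmod` already rounds toward negative infinity, so it replaces the explicit negative-value branch in the C++ code, and `rstrip("0")` plays the role of the digit loop that stops once the remaining nanoseconds reach zero.

```python
from datetime import date, timedelta

NS_PER_SECOND = 1_000_000_000
NS_PER_DAY = 86_400 * NS_PER_SECOND

def time64_to_str(ns_since_epoch):
    """Format nanoseconds since the epoch as YYYY-MM-DDThh:mm:ss[.fraction]."""
    # Floor division keeps time_of_day non-negative for pre-1970 values.
    days, time_of_day = divmod(ns_since_epoch, NS_PER_DAY)
    seconds_of_day, ns = divmod(time_of_day, NS_PER_SECOND)
    minutes_of_day, seconds = divmod(seconds_of_day, 60)
    hours, minutes = divmod(minutes_of_day, 60)

    day = date(1970, 1, 1) + timedelta(days=days)
    text = f"{day.isoformat()}T{hours:02d}:{minutes:02d}:{seconds:02d}"
    if ns:
        # Up to 9 fractional digits, with trailing zeros trimmed.
        text += "." + f"{ns:09d}".rstrip("0")
    return text

print(time64_to_str(0))                    # 1970-01-01T00:00:00
print(time64_to_str(-1))                   # 1969-12-31T23:59:59.999999999
print(time64_to_str(1_577_836_800_500_000_000))  # 2020-01-01T00:00:00.5
```

The second output is 29 characters long, matching the maximum stated in the comment above the C++ function.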
3 changes: 3 additions & 0 deletions src/core/csv/toa.h
@@ -28,6 +28,7 @@
void int8_toa(char** pch, int8_t value);
void int16_toa(char** pch, int16_t value);
void date32_toa(char** pch, int32_t value);
void time64_toa(char** pch, int64_t value);


//---- Generic -----------------------------------------------------------------
@@ -41,4 +42,6 @@ template<> inline void toa(char** pch, int64_t x) { ltoa(pch, x); }
template<> inline void toa(char** pch, float x) { ftoa(pch, x); }
template<> inline void toa(char** pch, double x) { dtoa(pch, x); }



#endif
2 changes: 1 addition & 1 deletion src/core/datatablemodule.cc
@@ -500,7 +500,7 @@ extern "C" {
py::ojoin::init(m);
py::osort::init(m);
py::oupdate::init(m);
py::odate::init();
py::datetime_init();

} catch (const std::exception& e) {
exception_to_python(e);
4 changes: 4 additions & 0 deletions src/core/frame/repr/html_styles.cc
@@ -98,6 +98,7 @@ static py::oobj generate_stylesheet() {
" color: var(--jp-ui-font-color3);"
" font-size: 9px;"
"}\n"
".datatable .frame tbody td { text-align: left; }\n"
".datatable .frame tr.coltypes .row_index {"
" background: var(--jp-border-color0);"
"}\n"
@@ -113,6 +114,9 @@ static py::oobj generate_stylesheet() {
" color: var(--jp-cell-editor-border-color);"
" font-size: 80%;"
"}\n"
".datatable .sp {"
" opacity: 0.25;"
"}\n"
".datatable .footer { font-size: 9px; }\n"
".datatable .frame_dimensions {"
" background: var(--jp-border-color3);"
24 changes: 23 additions & 1 deletion src/core/frame/repr/html_widget.h
@@ -181,9 +181,10 @@ class HtmlWidget : public dt::Widget {
case SType::STR32:
case SType::STR64: _render_str_value(col, i); break;
case SType::DATE32: _render_date_value(col, i); break;
case SType::TIME64: _render_time_value(col, i); break;
case SType::OBJ: _render_obj_value(col, i); break;
default:
html << "(unknown stype)";
html << "<span class=na>(unknown)</span>";
}
html << "</td>";
}
@@ -273,6 +274,27 @@ class HtmlWidget : public dt::Widget {
}
}

void _render_time_value(const Column& col, size_t row) {
static char out[30];
int64_t value;
bool isvalid = col.get_element(row, &value);
if (isvalid) {
char* ch = out;
time64_toa(&ch, value);
*ch = '\0';
if (out[10] == 'T') {
out[10] = '\0';
html << out;
html << "<span class=sp>T</span>";
html << out + 11;
} else {
html << out;
}
} else {
_render_na();
}
}

void _render_obj_value(const Column& col, size_t row) {
py::oobj val;
bool isvalid = col.get_element(row, &val);
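
The effect of `_render_time_value` on the generated markup can be sketched in a few lines of Python. The function name `render_time_html` is hypothetical; it only reproduces the string splitting around the `T` separator, which the new `.datatable .sp` rule then displays at reduced opacity.

```python
def render_time_html(text):
    """Wrap the 'T' between date and time in a span with class 'sp'."""
    # The date part is always 10 characters, so 'T' is expected at index 10.
    if len(text) > 10 and text[10] == "T":
        return text[:10] + "<span class=sp>T</span>" + text[11:]
    # Anything else is emitted unchanged, as in the C++ else-branch.
    return text

print(render_time_html("2021-03-17T09:00:00"))
# 2021-03-17<span class=sp>T</span>09:00:00
```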
26 changes: 23 additions & 3 deletions src/core/frame/repr/text_column.cc
@@ -182,8 +182,27 @@ tstring Data_TextColumn::_render_value_date(const Column& col, size_t i) const {
if (isvalid) {
char* ch = tmp;
date32_toa(&ch, value);
*ch = '\0';
return tstring(std::string(tmp));
return tstring(std::string(tmp, static_cast<size_t>(ch - tmp)));
} else {
return na_value_;
}
}


tstring Data_TextColumn::_render_value_time(const Column& col, size_t i) const {
static char tmp[30];
int64_t value;
bool isvalid = col.get_element(i, &value);
if (isvalid) {
char* ch = tmp;
time64_toa(&ch, value);
xassert(ch > tmp + 10);
xassert(tmp[10] == 'T');
tstring out;
out << std::string(tmp, 10);
out << tstring("T", style::dim);
out << std::string(tmp + 11, static_cast<size_t>(ch - tmp - 11));
return out;
} else {
return na_value_;
}
@@ -328,7 +347,8 @@ tstring Data_TextColumn::_render_value(const Column& col, size_t i) const {
case SType::STR32:
case SType::STR64: return _render_value_string(col, i);
case SType::DATE32: return _render_value_date(col, i);
default: return tstring("");
case SType::TIME64: return _render_value_time(col, i);
default: return tstring("<unknown>", style::dim);
}
}

1 change: 1 addition & 0 deletions src/core/frame/repr/text_column.h
@@ -115,6 +115,7 @@ class Data_TextColumn : public TextColumn {
tstring _render_value_bool(const Column&, size_t i) const;
tstring _render_value_string(const Column&, size_t i) const;
tstring _render_value_date(const Column&, size_t i) const;
tstring _render_value_time(const Column&, size_t i) const;

bool _needs_escaping(const CString&) const;
tstring _escape_string(const CString&) const;
1 change: 1 addition & 0 deletions src/core/python/_all.h
@@ -23,6 +23,7 @@

#include "python/bool.h"
#include "python/date.h"
#include "python/datetime.h"
#include "python/dict.h"
#include "python/float.h"
#include "python/int.h"
