Skip to content

Commit

Permalink
cql: avoid undefined behavior in totimestamp() of extreme dates
Browse files Browse the repository at this point in the history
This patch fixes a UBSAN-reported integer overflow during one of our
existing tests,

   test_native_functions.py::test_mintimeuuid_extreme_from_totimestamp

when attempting to convert an extreme "date" value, millions of years
in the past, into a "timestamp" value. When UBSAN crashing is enabled,
this test crashes before this patch, and succeeds after this patch.

The "date" CQL type is 32-bit count of *days* from the epoch, which can
span 2^31 days (5 million years) before or after the epoch. Meanwhile,
the "timestamp" type measures the number of milliseconds from the same
epoch, in 64 bits. Luckily (or intentionally), every "date", however
extreme, can be converted into a "timestamp": This is because 2^31 days
is 1.85e17 milliseconds, well below timestamp's limit of 2^63 milliseconds
(9.2e18).

But it turns out that our conversion function, date_to_time_point(),
used some boost::gregorian library code, which carried out these
calculations in **microsecond** resolution. The extra conversion to
microseconds wasn't just wasteful, it also caused an integer overflow
in the extreme case: 2^31 days is 1.85e20 microseconds, which does NOT
fit in a 64-bit integer. UBSAN notices this overflow, and complains
(plus, the conversion is incorrect).

The fix is to do the trivial conversion on our own (a day is, by
convention, exactly 86400 seconds - no fancy library is needed),
without the grace of Boost. The result is simpler, faster, correct
for the Pliocene-age dates, and fixes the UBSAN crash in the test.

Fixes #17516

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #17527
  • Loading branch information
nyh authored and avikivity committed Feb 27, 2024
1 parent 0c37604 commit fc86174
Show file tree
Hide file tree
Showing 2 changed files with 47 additions and 6 deletions.
11 changes: 6 additions & 5 deletions cql3/functions/castas_fcts.cc
Expand Up @@ -11,6 +11,7 @@
#include "utils/UUID_gen.hh"
#include "cql3/functions/native_scalar_function.hh"
#include <boost/date_time/posix_time/posix_time.hpp>
#include <chrono>

namespace cql3 {
namespace functions {
Expand Down Expand Up @@ -133,11 +134,11 @@ simple_date_native_type time_point_to_date(const db_clock::time_point& tp) {
}

db_clock::time_point date_to_time_point(const uint32_t date) {
const auto epoch = boost::posix_time::from_time_t(0);
const auto target_date = epoch + boost::gregorian::days(int64_t(date) - (1UL<<31));
boost::posix_time::time_duration duration = target_date - epoch;
const auto millis = std::chrono::milliseconds(duration.total_milliseconds());
return db_clock::time_point(std::chrono::duration_cast<db_clock::duration>(millis));
// "date" counts the number of days since the epoch, where the middle
// of the unsigned range, 2^31, signifies the epoch itself.
int64_t millis_since_epoch = (int64_t(date) - (1UL<<31)) * 24 * 60 * 60 * 1000;
return db_clock::time_point(std::chrono::duration_cast<db_clock::duration>(
std::chrono::milliseconds(millis_since_epoch)));
}

static data_value castas_fctn_from_timestamp_to_date(data_value from) {
Expand Down
42 changes: 41 additions & 1 deletion test/cql-pytest/test_native_functions.py
Expand Up @@ -11,11 +11,12 @@

import pytest
from util import new_test_table, unique_key_int
from datetime import datetime
from cassandra.protocol import InvalidRequest

@pytest.fixture(scope="module")
def table1(cql, test_keyspace):
with new_test_table(cql, test_keyspace, "p int, i int, g bigint, b blob, s text, t timestamp, u timeuuid, PRIMARY KEY (p)") as table:
with new_test_table(cql, test_keyspace, "p int, i int, g bigint, b blob, s text, t timestamp, u timeuuid, d date, PRIMARY KEY (p)") as table:
yield table

# Check that a function that can take a column name as a parameter, can also
Expand Down Expand Up @@ -121,3 +122,42 @@ def test_mintimeuuid_extreme_from_totimestamp(cql, table1):
cql.execute(f"SELECT * FROM {table1} WHERE p={p} and u < mintimeuuid(totimestamp(123)) ALLOW FILTERING")
except:
pass

# According to the documentation, the toTimestamp() function can take either
# a "timeuuid" or a "date" value and convert it to a "timestamp" type
# (64-bit signed integer representing number of milliseconds since the UNIX
# epoch). Let's test these conversions, and their error cases (especially,
# the range dates covered by the different types isn't identical).
# In test_type_date.py we have test coverage on the different ways a "date"
# value can initialized, so we don't need to cover all of these here.
def test_totimestamp_date(cql, table1):
p = unique_key_int()
# The date 2**31 is the day of the epoch, so 2**31+1 is one day later,
# Midnight January 2nd, 1970:
cql.execute(f"INSERT INTO {table1} (p, d) VALUES ({p}, {2**31+1})")
assert [(datetime(1970,1,2,0,0),)] == list(cql.execute(f"SELECT totimestamp(d) FROM {table1} WHERE p={p}"))
# The day 2**31-1 is one day before the epoch, and this (negative
# timestamps) is allowed:
cql.execute(f"INSERT INTO {table1} (p, d) VALUES ({p}, {2**31-1})")
assert [(datetime(1969,12,31,0,0),)] == list(cql.execute(f"SELECT totimestamp(d) FROM {table1} WHERE p={p}"))

# Same as above test test_totimestame_date(), but use extreme dates
# millions of years in the past. These tests cannot be run with the
# current Python driver because of bugs it has in converting extreme
# timestamps - https://github.com/scylladb/python-driver/issues/255 -
# so the test is skipped.
@pytest.mark.skip(reason="Python driver bug")
def test_totimestamp_date_extreme(cql, table1):
p = unique_key_int()
# The day 2**31-2**29 is 2**29 days before the epoch - it's a useless date
# (more than a million years before our time), but at 4e16 milliseconds
# before the epoch, it should still fit nicely into 63 bits (9e18
# milliseconds), and should work fine.
cql.execute(f"INSERT INTO {table1} (p, d) VALUES ({p}, {2**31-2**29})")
cql.execute(f"SELECT totimestamp(d) FROM {table1} WHERE p={p}")
# The day 2**30 is 2**30 days before the epoch - it's a useless date
# (almost 3 million years before our time), but at 10^17 milliseconds
# before the epoch, it should still fit 63 bits (10^19 milliseconds)
# and work fine.
cql.execute(f"INSERT INTO {table1} (p, d) VALUES ({p}, {2**30})")
cql.execute(f"SELECT totimestamp(d) FROM {table1} WHERE p={p}")

0 comments on commit fc86174

Please sign in to comment.