Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support date math with mutate2 #162

Open
tolot27 opened this issue Aug 27, 2021 · 8 comments
Open

support date math with mutate2 #162

tolot27 opened this issue Aug 27, 2021 · 8 comments

Comments

@tolot27
Copy link

tolot27 commented Aug 27, 2021

Currently, mutate2 accepts date strings and can do some math with it, i.e. subtracting seconds but it does not return date-formatted strings (see #159 (comment)).

Even if it can do some math if the expression contains a numeric value, it cannot compute the difference (for instance) between dates; although, it detects the date types:

echo -e "date\n2021-08-26" | csvtk mutate2 -e '"2021-08-28"-$date' -n res
[ERRO] Cannot transition token types from MODIFIER [-] to TIME [2021-08-26 00:00:00 +0200 CEST]

It can compare dates, already:

echo -e "date\n2021-08-26" | csvtk mutate2 -e '"2021-08-28"> $date' -n res
date,res
2021-08-26,true
@shenwei356
Copy link
Owner

I do some tests and believe the package doe not support date computation.

@tolot27
Copy link
Author

tolot27 commented Aug 28, 2021

According to https://github.com/Knetic/govaluate/blob/master/MANUAL.md#types it converts dates to float64 automatically. Hence, it should be able to do some math with it. And it works, indeed:

echo -e "date\n2021-08-26" | csvtk mutate2 -e '$date - 86400' -n res
date,res
2021-08-26,1629842400.00

But providing two dates does not seem to work because of some inconsistencies.

@san-r
Copy link

san-r commented Oct 19, 2021

So, we can operate on the basis of adding/subtracting 'seconds' from a date.

The above operation gave us 1629842400.00, which the Epoch Converter at https://www.epochconverter.com/ shows as 2021-08-24 in GMT.

The Excel formula to convert Unix timestamp to Date at https://exceljet.net/formula/convert-unix-time-stamp-to-excel-date also gives us 2021-08-24.

But, in the previous post one day was subtracted from 2021-08-26, which should be 2021-08-25. Then why is the Unix timestamp to date conversion of the result of operation showing 2021-08-24 ?

Being able to calculate the difference in days between two dates directly in csvtk would indeed be very useful.

@san-r
Copy link

san-r commented Oct 21, 2021

The following is done using miller's go port:

If we have a csv file test.txt as:

name,date1,date2
A,2020-09-20,2020-08-30
B,2020-01-31,2020-01-20

Then the following command:

mlr --c2p --from test_date.txt put "$date_diff = (strptime($date1, \"%Y-%m-%d\") - strptime($date2, \"%Y-%m-%d\"))/86400"

gives:

name date1      date2      date_diff
A    2020-09-20 2020-08-30 21
B    2020-01-31 2020-01-20 11

Something similar may be implemented for csvtk.

@shenwei356
Copy link
Owner

Sorry I can't make a promise I'll add this. Because mutate2 relies on a third-party package that has not been updated for > 4 years.

@san-r
Copy link

san-r commented Oct 22, 2021

I've noticed one custom function len() in csvtk. If another custom function DATEDIF() could be made available in csvtk it would be really nice since it is needed very frequently while analyzing data.

Usage: https://exceljet.net/excel-functions/excel-datedif-function

@shenwei356
Copy link
Owner

Let's leave this to v0.25.0, after finishing #172

@moorereason
Copy link
Contributor

@san-r,
#268 has been merged into master, so the Expr language is available for testing. Here's a way to replicate your miller solution:

cat dates.csv | csvtk mutate3 -n date_diff -e 'int( (date($date1) - date($date2)).Seconds() / 86400 )'

The date math returns a time.Duration from the underlying Go language.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants