Lubridate commands #56

Closed
rleyvasal opened this issue Sep 15, 2021 · 14 comments
Labels
doc Improvements or additions to documentation enhancement New feature or request

Comments

@rleyvasal
Collaborator

rleyvasal commented Sep 15, 2021

Hi @pwwang, do you have plans to add lubridate commands to datar?

I am trying to convert the Date column of a stock time series to datetime with datar's mutate().

The data is from Yahoo Finance.

import pandas as pd
from datar.all import *

aapl = pd.read_csv("AAPL.csv")

aapl.Date = pd.to_datetime(aapl.Date.astype('str'))  # works: pandas changes the dtype to datetime

aapl = aapl >> mutate(Date = as_datetime(f.Date))  # does not work; shows an error message

aapl = aapl >> mutate(Date = as_date(f.Date))  # does not work; shows no error message
@pwwang
Owner

pwwang commented Sep 15, 2021

Have you tried this:

https://pwwang.github.io/datar/api/datar.base.date/#datar.base.date.as_date

lubridate is definitely useful for working with datetimes. I haven't had a chance to dig into the details of how to make it compatible across R datetimes, Python datetimes, and the ones from pandas.

It's not in the near-term plan. If we put it on the agenda, I guess an independent Python package would be a better way to implement it.

But before that, I think we should have some temporary solutions for datetimes via the datar and pandas APIs.

@rleyvasal
Collaborator Author

I tried the following and it did not work.

aapl['Date'] = as_date(aapl.Date)

@pwwang pwwang added the doc Improvements or additions to documentation label Sep 15, 2021
@pwwang
Owner

pwwang commented Sep 15, 2021

That's because as_date doesn't recognize your format.

Try this:

aapl['Date'] = as_date(aapl.Date, "%b %d, %Y")

The default formats as_date will try are:

"%Y-%m-%d",
"%Y/%m/%d",
"%Y-%m-%d %H:%M:%S",
"%Y/%m/%d %H:%M:%S",
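
The try-each-format behaviour described above can be sketched with the standard library alone. This is only an illustration, not datar's actual implementation: `parse_date` and `DEFAULT_FORMATS` are hypothetical names, and the format list is copied from the comment above. Note that `%b` matches month abbreviations in the current locale, so "Sep" is assumed to be parsed under an English or C locale.

```python
from datetime import date, datetime
from typing import Optional, Sequence

# Formats listed in the comment above; the names here are illustrative only.
DEFAULT_FORMATS = (
    "%Y-%m-%d",
    "%Y/%m/%d",
    "%Y-%m-%d %H:%M:%S",
    "%Y/%m/%d %H:%M:%S",
)


def parse_date(text: str, formats: Optional[Sequence[str]] = None) -> Optional[date]:
    """Try each format in turn; return None if none of them match."""
    for fmt in formats or DEFAULT_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # this format did not match, try the next one
    return None


print(parse_date("2021-09-16"))                   # matches a default format
print(parse_date("Sep 16, 2021"))                 # None: no default format matches
print(parse_date("Sep 16, 2021", ["%b %d, %Y"]))  # matches the explicit format
```

A string such as "Sep 16, 2021" falls through every default format, which is why an explicit format has to be passed.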

@rleyvasal
Collaborator Author

aapl['Date'] = as_date(aapl.Date, "%b %d, %Y") did not change the dtype to date.

I also tried the following, and it did not work either.
aapl['Date'] = as_date(aapl.Date, "%Y-%m-%d")

@pwwang pwwang added the enhancement New feature or request label Sep 16, 2021
@pwwang
Owner

pwwang commented Sep 16, 2021

That's a good catch, actually.

>>> df = tibble(d="Sep 16, 2021")
>>> df >> mutate(date=as_date(f.d, "%b %d, %Y"))
              d        date
       <object>    <object>
0  Sep 16, 2021  2021-09-16
>>> df2 = df >> mutate(date=as_date(f.d, "%b %d, %Y"))
>>> df2.date
0    2021-09-16
Name: date, dtype: object
>>> type(df2.date[0])
<class 'datetime.date'>
>>> df2["dd"] = pd.to_datetime(df2.date)
>>> df2
              d        date               dd
       <object>    <object> <datetime64[ns]>
0  Sep 16, 2021  2021-09-16       2021-09-16

We need to add one more layer to convert datetime.date to a pandas datetime.
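
The extra conversion step can be reproduced with pandas alone. This is a small sketch of the dtype change, independent of datar; the variable names are illustrative.

```python
import datetime

import pandas as pd

# A column of plain datetime.date objects is stored with dtype "object".
dates = pd.Series([datetime.date(2021, 9, 16), datetime.date(2021, 9, 17)])
print(dates.dtype)

# pd.to_datetime converts it to a pandas datetime64 column,
# which enables the .dt accessor and datetime arithmetic.
converted = pd.to_datetime(dates)
print(converted.dtype)
print(converted.dt.year.tolist())
```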

pwwang added a commit that referenced this issue Sep 16, 2021
@pwwang pwwang mentioned this issue Sep 16, 2021
pwwang added a commit that referenced this issue Sep 16, 2021
* 📝 Add documentation for the "blind" environments (#45, #54, #55)

* 🩹 Fix trimws not importable from datar.all/datar.base

* ✨ Make as_date() return pd datetime types; Add as_pd_date() as an alias of pd.to_datetime() (#56)

* 🔖 0.5.1

* 🚨 Fix linting

* 👷 Deploy the docs on dev branch as well

* 💚 Fix docs deply in CI
@pwwang
Owner

pwwang commented Sep 16, 2021

As of v0.5.1, as_date() returns pandas datetime types, and the column dtype is a pandas datetime as well.
An as_pd_date() function has also been added as an alias of pandas.to_datetime(), combining datar's datetime recognition with the power of that pandas function.

Let me know if it solves your problem.

@rleyvasal
Collaborator Author

@pwwang, thank you very much for the update

aapl.Date = as_date(aapl.Date) works.

However, mutate() gives an error message:

aapl >> mutate(Date = as_date(f.Date))

AttributeError: REGULAR
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-10-d566a4235cfd> in <module>
----> 1 aapl  >> mutate(Date = as_date(f.Date))

D:\Anaconda3\lib\site-packages\pipda\function.py in _pipda_eval(self, data, context)
     94             # leave args/kwargs for the child
     95             # verb/function/operator to evaluate
---> 96             return func(*bondargs.args, **bondargs.kwargs)  # type: ignore
     97 
     98         args = evaluate_expr(

D:\Anaconda3\lib\functools.py in wrapper(*args, **kw)
    873                             '1 positional argument')
    874 
--> 875         return dispatch(args[0].__class__)(*args, **kw)
    876 
    877     funcname = getattr(func, '__name__', 'singledispatch function')

D:\Anaconda3\lib\site-packages\datar\dplyr\mutate.py in mutate(_data, _keep, _before, _after, base0_, *args, **kwargs)
     97     # out.columns.difference(removed)
     98     # changes column order when removed == []
---> 99     out = out[setdiff(out.columns, removed, __calling_env=CallingEnvs.REGULAR)]
    100     if _before is not None or _after is not None:
    101         new = setdiff(

D:\Anaconda3\lib\enum.py in __getattr__(cls, name)
    382             return cls._member_map_[name]
    383         except KeyError:
--> 384             raise AttributeError(name) from None
    385 
    386     def __getitem__(cls, name):

@pwwang
Owner

pwwang commented Sep 16, 2021

You may have an older version of pipda. Try pip install -U pipda.

See what prints from datar.get_versions()
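
If datar.get_versions() is not available, a similar report can be put together with the standard library. The helper below is a hypothetical sketch (`installed_versions` is not part of datar) and assumes Python 3.8 or later for `importlib.metadata`.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Report the packages discussed in this thread.
for name, version in installed_versions(["datar", "pipda", "pandas", "numpy"]).items():
    print(f"{name:<8}: {version or 'not installed'}")
```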

@rleyvasal
Collaborator Author

This is the output:
[screenshot: pipda_version]

@pwwang
Owner

pwwang commented Sep 16, 2021

Here is mine:

python   : 3.9.5 (default, Jun  4 2021, 12:28:51) 
           [GCC 7.5.0]
datar    : 0.5.1
numpy    : 1.21.1
pandas   : 1.3.1
pipda    : 0.4.5
executing: 0.8.0
varname  : 0.8.1

You should definitely upgrade pipda.

@rleyvasal
Collaborator Author

rleyvasal commented Sep 16, 2021

mutate() is working now after pip install -U pipda.

Shouldn't pipda be updated by pip install -U datar?

[screenshot: pipda_version_updated]

@pwwang
Owner

pwwang commented Sep 16, 2021

It really should.

It didn't because I used a very loose version specification for it, pipda = "*", and the same for varname. Any installed version satisfies that requirement, so pip simply won't upgrade it (by default pip uses the "only-if-needed" upgrade strategy). I did that because I am also the author of those packages and keep maintaining and upgrading them. With stricter pins, each time I upgraded one of them I might need to release a new version of datar just to bump the dependencies (I'm always eager to keep them up to date ...).

But yeah, I will try to add more specific versions for those dependencies, probably starting with the next version, since those dependencies are pretty stable now.
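
The effect of the wildcard requirement can be illustrated with a toy model of the "only-if-needed" rule. This is not pip's implementation: `parse_version` and `needs_upgrade` are hypothetical helpers, versions are assumed to be simple dotted integers, and the installed version 0.4.2 and the lower bound 0.4.5 are made-up values chosen for the example.

```python
from typing import Callable, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Parse a simple dotted version such as '0.4.5' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def needs_upgrade(installed: str, satisfies: Callable[[Version], bool]) -> bool:
    """Toy "only-if-needed" rule: upgrade only when the requirement is unmet."""
    return not satisfies(parse_version(installed))


def any_version(version: Version) -> bool:
    """Models a wildcard requirement such as pipda = "*"."""
    return True


def at_least_0_4_5(version: Version) -> bool:
    """Models a hypothetical tighter requirement, pipda >= 0.4.5."""
    return version >= parse_version("0.4.5")


print(needs_upgrade("0.4.2", any_version))     # False: "*" is already satisfied
print(needs_upgrade("0.4.2", at_least_0_4_5))  # True: the lower bound forces an upgrade
print(needs_upgrade("0.4.5", at_least_0_4_5))  # False: already new enough
```

With pip itself, running pip install -U --upgrade-strategy eager datar asks pip to upgrade dependencies even when the installed versions already satisfy the requirements.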

@rleyvasal
Collaborator Author

Thanks @pwwang!

@pwwang
Owner

pwwang commented Sep 16, 2021

No problem. Closing this for now. Feel free to open new issues if you have other questions.
