This repository was archived by the owner on Nov 1, 2024. It is now read-only.
Closed
39 changes: 39 additions & 0 deletions .github/workflows/deploy-doc.yml
@@ -0,0 +1,39 @@
name: Deploy Doc
on:
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - name: Setup Python environment
        uses: actions/setup-python@v2
        with:
          python-version: 3.7

      - name: Check out source repository
        uses: actions/checkout@v2
        with:
          ref: "release/0.1.0"
          submodules: recursive

      - name: Install TorchArrow
        run: |
          pip install --pre torcharrow -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
Contributor

Does the schedule make sure we can get the nightly build on the same day? ^_^

Contributor Author

I think there's no guarantee; it depends on which one runs first. I'm actually thinking of changing this one to manual run only, since it writes to the "release" folder and shouldn't need to run nightly. In that case, though, we'll need to be careful about which torcharrow build it installs. I'm using the nightly build here because that's the one available, but ideally I should use an RC build, which is currently not uploaded anywhere.

On the other hand, we could probably add a build-and-deploy doc section to one of the nightly builds, and make sure it uploads the docs to the "main" folder (like torchdata).

But yeah, this still needs some tweaks.
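
The two options discussed in this thread could look roughly like the fragments below. This is only a sketch under stated assumptions, not something this PR contains: the `torcharrow_version` input name is hypothetical, and the pinned install assumes the requested build is published somewhere pip can reach, which the thread notes is not yet the case for RC builds.

```yaml
# Option 1 (sketch): manual-only release docs, with the version chosen at dispatch time.
name: Deploy Doc
on:
  workflow_dispatch:
    inputs:
      torcharrow_version:   # hypothetical input name
        description: "Exact torcharrow version to build the release docs against"
        required: true

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Install TorchArrow
        env:
          TORCHARROW_VERSION: ${{ github.event.inputs.torcharrow_version }}
        run: |
          # Assumes this exact version is available from an index pip can see.
          pip install "torcharrow==${TORCHARROW_VERSION}"
```

```yaml
# Option 2 (sketch): a step added to an existing nightly workflow,
# publishing to a rolling "main" folder instead of a release folder.
      - name: Deploy nightly docs
        uses: JamesIves/github-pages-deploy-action@v4.2.5
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          branch: gh-pages
          folder: docs/build/html
          target-folder: main
```

Keeping the release workflow manual and pinned avoids the ordering question raised above, since the docs build no longer depends on whether the nightly wheel was published first.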


      - name: Build the docs
        run: |
          cd ./docs
          pip install -r requirements.txt --user
          make html
          cd ..

      - name: Deploy Docs on Push
        uses: JamesIves/github-pages-deploy-action@v4.2.5
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          branch: gh-pages # The branch the action should deploy to.
          folder: docs/build/html # The folder the action should deploy.
          target-folder: 0.1.0
15 changes: 0 additions & 15 deletions .github/workflows/ubuntu.yml
@@ -77,18 +77,3 @@ jobs:
      - name : Install TorchArrow
        run: |
          CCACHE_DIR=$GITHUB_WORKSPACE/.ccache python setup.py install --user

      - name: Build the docs
        run: |
          cd ./docs
          pip3 install -r requirements.txt --user
          make html
          cd ..

      - name: Deploy Docs on Push
        if: ${{ github.event_name == 'push' }}
        uses: JamesIves/github-pages-deploy-action@releases/v3
        with:
          ACCESS_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          BRANCH: gh-pages # The branch the action should deploy to.
          FOLDER: docs/build/html # The folder the action should deploy.
70 changes: 37 additions & 33 deletions docs/source/column.rst
@@ -3,8 +3,8 @@
torcharrow.Column
==========================

A :class:`torcharrow.Column` is a 1-dimension torch.Tensor like data structure containing
elements of a single data type. It also supports non-numeric types such as string,
A :class:`torcharrow.Column` is a 1-dimension torch.Tensor like data structure containing
elements of a single data type. It also supports non-numeric types such as string,
list, struct.

Data types
@@ -40,6 +40,7 @@ Column class reference
.. autosummary::
:toctree: generated
:nosignatures:
:template: class.rst
@NivekT, Jun 20, 2022

You may want to check if the output page looks as you expect and edit the docstrings or template.

Contributor Author

Yep, it looks the same to me with or without the template, haha.


Column.head
Column.tail
@@ -63,7 +64,7 @@ Column class reference
Column.to_arrow
Column.to_tensor
Column.to_pylist
Column.to_pandas
Column.to_pandas
Contributor @wenleix, Jun 20, 2022

Separate from this PR: it looks like the current pandas conversion is not efficient (e.g., it converts each element to a Python object, then converts that into pandas format):

# default implementation, normally this should be zero copy...
self._prototype_support_warning("to_pandas")
return pd.Series(self)

Maybe we should use something like self.to_arrow().to_pandas() to fix it? :)



NumericalColumn class reference
@@ -73,6 +74,7 @@ NumericalColumn class reference
.. autosummary::
:toctree: generated
:nosignatures:
:template: class.rst

NumericalColumn.abs
NumericalColumn.ceil
@@ -96,42 +98,44 @@ StringColumn class reference
.. autosummary::
:toctree: generated
:nosignatures:
:template: class.rst

istring_column.StringMethods.length
Contributor

As a follow-up, maybe we should rename istring_column.py to string_column.py to get better docs :)

(By the way, how does pandas render the docs for, or implement, str_col.str.XXX?)

istring_column.StringMethods.slice
istring_column.StringMethods.split
istring_column.StringMethods.strip

istring_column.StringMethods.isalpha
istring_column.StringMethods.isnumeric
istring_column.StringMethods.isalnum
istring_column.StringMethods.isdigit
istring_column.StringMethods.isdecimal
istring_column.StringMethods.isspace
istring_column.StringMethods.islower
istring_column.StringMethods.isupper
istring_column.StringMethods.istitle

istring_column.StringMethods.lower
istring_column.StringMethods.upper

istring_column.StringMethods.startswith
istring_column.StringMethods.endswith
istring_column.StringMethods.count
istring_column.StringMethods.find
istring_column.StringMethods.replace
istring_column.StringMethods.match
istring_column.StringMethods.contains
istring_column.StringMethods.findall

torcharrow.istring_column.StringMethods.length
torcharrow.istring_column.StringMethods.slice
torcharrow.istring_column.StringMethods.split
torcharrow.istring_column.StringMethods.strip

torcharrow.istring_column.StringMethods.isalpha
torcharrow.istring_column.StringMethods.isnumeric
torcharrow.istring_column.StringMethods.isalnum
torcharrow.istring_column.StringMethods.isdigit
torcharrow.istring_column.StringMethods.isdecimal
torcharrow.istring_column.StringMethods.isspace
torcharrow.istring_column.StringMethods.islower
torcharrow.istring_column.StringMethods.isupper
torcharrow.istring_column.StringMethods.istitle

torcharrow.istring_column.StringMethods.lower
torcharrow.istring_column.StringMethods.upper

torcharrow.istring_column.StringMethods.startswith
torcharrow.istring_column.StringMethods.endswith
torcharrow.istring_column.StringMethods.count
torcharrow.istring_column.StringMethods.find
torcharrow.istring_column.StringMethods.replace
torcharrow.istring_column.StringMethods.match
torcharrow.istring_column.StringMethods.contains
torcharrow.istring_column.StringMethods.findall

ListColumn class reference
-----------------------------------
.. class:: ListColumn()

.. autosummary::
:toctree: generated
:nosignatures:
:template: class.rst

torcharrow.ilist_column.ListMethods.length
torcharrow.ilist_column.ListMethods.slice
torcharrow.ilist_column.ListMethods.vmap
ilist_column.ListMethods.length
ilist_column.ListMethods.slice
ilist_column.ListMethods.vmap
7 changes: 7 additions & 0 deletions docs/source/dataframe.rst
@@ -25,6 +25,7 @@ DataFrame Class and General APIs
.. autosummary::
:toctree: generated
:nosignatures:
:template: class.rst
Contributor

I copied class.rst and function.rst from TorchData or some other domain library, but I actually don't know what style change they bring 😳

Contributor Author

Me neither... but there seems to be no harm in leaving it here.


DataFrame.head
DataFrame.tail
@@ -42,6 +43,7 @@ Functional API
.. autosummary::
:toctree: generated
:nosignatures:
:template: function.rst

DataFrame.map
DataFrame.filter
@@ -54,6 +56,7 @@ Relational API
.. autosummary::
:toctree: generated
:nosignatures:
:template: function.rst

DataFrame.select
DataFrame.where
@@ -64,6 +67,7 @@ Data Cleaning
.. autosummary::
:toctree: generated
:nosignatures:
:template: function.rst

DataFrame.fill_null
DataFrame.drop_null
@@ -74,6 +78,7 @@ Conversions
.. autosummary::
:toctree: generated
:nosignatures:
:template: function.rst

DataFrame.to_arrow
DataFrame.to_tensor
@@ -85,6 +90,7 @@ Statistics
.. autosummary::
:toctree: generated
:nosignatures:
:template: function.rst

DataFrame.min
DataFrame.max
@@ -100,5 +106,6 @@ Arithmtic Operations
.. autosummary::
:toctree: generated
:nosignatures:
:template: function.rst

DataFrame.log
2 changes: 2 additions & 0 deletions docs/source/functional.rst
@@ -37,6 +37,7 @@ Recommendation Operations
.. autosummary::
:toctree: generated
:nosignatures:
:template: function.rst

bucketize
sigrid_hash
@@ -55,5 +56,6 @@ High-level Operations
.. autosummary::
:toctree: generated
:nosignatures:
:template: function.rst

scale_to_0_1
5 changes: 2 additions & 3 deletions docs/source/torcharrow.rst
@@ -4,8 +4,8 @@ torcharrow
==========================

The torcharrow package contains data structures for two-dimensional, potentially heterogeneous tabular data,
denoted as dataframe.
It also defines relational operations over these dataframes.
denoted as dataframe.
It also defines relational operations over these dataframes.
Additionally, it provides utilities for conversion with other formats (especially zero-copy conversion with Arrow arrays),
and other useful utilities.

@@ -28,4 +28,3 @@ Mutating Ops

concat
if_else