This repository has been archived by the owner on May 4, 2019. It is now read-only.

Commit

Live Free or Doc Hard
ararslan committed Mar 9, 2017
1 parent f13c806 commit 6090f47
Showing 6 changed files with 123 additions and 0 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
*.jl.cov
*.jl.*.cov
*.jl.mem
docs/build
docs/site
21 changes: 21 additions & 0 deletions docs/make.jl
@@ -0,0 +1,21 @@
using DataArrays, Documenter

makedocs(
    modules = [DataArrays],
    clean = false,
    format = :html,
    sitename = "DataArrays.jl",
    authors = "Simon Kornblith, John Myles White, and other contributors",
    pages = [
        "Home" => "index.md",
        "Missing Data and Arrays" => "da.md",
        "Utilities" => "util.md",
    ],
)

deploydocs(
    repo = "github.com/JuliaStats/DataArrays.jl.git",
    target = "build",
    deps = nothing,
    make = nothing,
)
40 changes: 40 additions & 0 deletions docs/src/da.md
@@ -0,0 +1,40 @@
# Representing missing data

```@meta
CurrentModule = DataArrays
```

```@docs
NA
NAtype
```

## Arrays with possibly missing data

```@docs
AbstractDataArray
AbstractDataVector
AbstractDataMatrix
DataArray
DataVector
DataMatrix
@data
isna
allna
anyna
dropna
levels
```

## Pooled arrays

```@docs
PooledDataArray
@pdata
compact
setlevels
setlevels!
replace!
PooledDataVecs
getpoolidx
```
26 changes: 26 additions & 0 deletions docs/src/index.md
@@ -0,0 +1,26 @@
# DataArrays.jl

This package provides functionality for working with [missing data](https://en.wikipedia.org/wiki/Missing_data)
in Julia.
In particular, it provides the following:

* `NA`: A singleton representing a missing value
* `DataArray{T}`: An array type that can house both values of type `T` and missing values
* `PooledDataArray{T}`: An array type akin to `DataArray` but optimized for arrays with a smaller set of unique
values, as commonly occurs with categorical data

## Installation

This package is available for Julia versions 0.6 and up.
To install it, run

```julia
Pkg.add("DataArrays")
```

from the Julia REPL.

## Contents

```@contents
```
14 changes: 14 additions & 0 deletions docs/src/util.md
@@ -0,0 +1,14 @@
# Utility functions

```@meta
CurrentModule = DataArrays
```

```@docs
cut
gl
xtab
xtabs
reldiff
percent_change
```
17 changes: 17 additions & 0 deletions src/pooleddataarray.jl
@@ -542,6 +542,13 @@ Base.find(pdv::PooledDataVector{Bool}) = find(convert(Vector{Bool}, pdv, false))
##
##############################################################################

"""
    getpoolidx(pda::PooledDataArray, val)

Return the index of the first occurrence of `val` in the value pool for `pda`.
If `val` is not already in the value pool, `pda` is modified to include it in
the pool.
"""
function getpoolidx{T,R}(pda::PooledDataArray{T,R}, val::Any)
    val::T = convert(T,val)
    pool_idx = findfirst(pda.pool, val)
@@ -587,6 +594,11 @@ end
##
##############################################################################

"""
    replace!(x::PooledDataArray, from, to)

Replace all occurrences of `from` in `x` with `to`, modifying `x` in place.
"""
function replace!(x::PooledDataArray{NAtype}, fromval::NAtype, toval::NAtype)
    NA # no-op to deal with warning
end
@@ -676,7 +688,12 @@ Perm{O<:Base.Sort.Ordering}(o::O, v::PooledDataVector) = FastPerm(o, v)
##
##############################################################################

"""
    PooledDataVecs(v1, v2) -> (pda1, pda2)

Return a tuple of `PooledDataArray`s created from the data in `v1` and `v2`,
respectively, but sharing a common value pool.
"""
function PooledDataVecs{S,Q<:Integer,R<:Integer,N}(v1::PooledDataArray{S,Q,N},
                                                   v2::PooledDataArray{S,R,N})
    pool = sort(unique([v1.pool; v2.pool]))
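
The docstrings added above for `getpoolidx` and `PooledDataVecs` describe two pool operations: looking up a value's index in the pool (adding the value if it is absent) and giving two arrays a common pool. The sketch below illustrates those two ideas only. It is not the package's implementation. Plain `Vector`s stand in for a `PooledDataArray`'s pool, the names `pool_index!` and `shared_pool_refs` are hypothetical, and it assumes Julia 1.x syntax rather than the 0.6-era syntax in the diff.

```julia
# Illustrative sketch only: plain Vectors stand in for a value pool.
# `pool_index!` and `shared_pool_refs` are hypothetical names,
# not functions exported by DataArrays.

# Return the index of `val` in `pool`, appending it first if it is absent
# (the behavior described by the `getpoolidx` docstring).
function pool_index!(pool::Vector{T}, val) where {T}
    v = convert(T, val)
    idx = findfirst(isequal(v), pool)
    if idx === nothing
        push!(pool, v)
        idx = length(pool)
    end
    return idx
end

# Build one sorted pool from two vectors and encode each vector as indices
# into that pool (the idea behind the `PooledDataVecs` docstring).
function shared_pool_refs(v1::Vector{T}, v2::Vector{T}) where {T}
    pool = sort(unique([v1; v2]))
    lookup = Dict(val => i for (i, val) in enumerate(pool))
    refs1 = [lookup[x] for x in v1]
    refs2 = [lookup[x] for x in v2]
    return pool, refs1, refs2
end

pool = ["a", "b"]
existing_idx = pool_index!(pool, "b")   # value already pooled: pool unchanged
new_idx = pool_index!(pool, "c")        # value not pooled: appended to pool

shared, r1, r2 = shared_pool_refs(["x", "z"], ["y", "x"])
```

Because both index vectors refer to the same sorted pool, equal indices in `r1` and `r2` always denote equal values, which is what makes a shared pool useful when comparing or joining two pooled arrays.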