Commit

clean up source code (#2950)
bkamins committed Nov 26, 2021
1 parent 421db4d commit 2e18895
Showing 41 changed files with 939 additions and 930 deletions.
10 changes: 5 additions & 5 deletions docs/src/man/comparisons.md
@@ -8,12 +8,12 @@ A sample data set can be created using the following code:
using DataFrames
using Statistics

-df = DataFrame(grp = repeat(1:2, 3), x = 6:-1:1, y = 4:9, z = [3:7; missing], id = 'a':'f')
-df2 = DataFrame(grp = [1, 3], w = [10, 11])
+df = DataFrame(grp=repeat(1:2, 3), x=6:-1:1, y=4:9, z=[3:7; missing], id='a':'f')
+df2 = DataFrame(grp=[1, 3], w=[10, 11])
```

!!! note

Some of the operations mutate the tables so every operation assumes that it is done on the original data frame.

Note that in the comparisons presented below predicates like `x -> x >= 1` can
@@ -173,7 +173,7 @@ This section includes more complex examples.
| Row-wise operation | `df.assign(x_y_min = df.apply(lambda v: min(v.x, v.y), axis=1))` | `transform(df, [:x, :y] => ByRow(min))` |
| | `df.assign(x_y_argmax = df.apply(lambda v: df.columns[v.argmax()], axis=1))` | `transform(df, AsTable([:x, :y]) => ByRow(argmax))` |
| DataFrame as input | `df.groupby('grp').head(2)` | `combine(d -> first(d, 2), groupby(df, :grp))` |
-| DataFrame as output | `df[['x']].agg(lambda x: [min(x), max(x)])` | `combine(df, :x => (x -> (x = [minimum(x), maximum(x)],)) => AsTable)` |
+| DataFrame as output | `df[['x']].agg(lambda x: [min(x), max(x)])` | `combine(df, :x => (x -> (x=[minimum(x), maximum(x)],)) => AsTable)` |

Note that pandas preserves the same row order after `groupby` whereas DataFrames.jl
shows them grouped by the provided keys after the `combine` operation,
@@ -249,7 +249,7 @@ The following table compares the main functions of DataFrames.jl with the R pack
library(data.table)
df <- data.table(grp = rep(1:2, 3), x = 6:1, y = 4:9,
z = c(3:7, NA), id = letters[1:6])
-df2 <- data.table(grp=c(1,3), w = c(10,11))
+df2 <- data.table(grp=c(1,3), w = c(10,11))
```

| Operation | data.table | DataFrames.jl |
5 changes: 2 additions & 3 deletions docs/src/man/getting_started.md
@@ -37,7 +37,7 @@ each corresponding to a column or variable. The simplest way of constructing a
```jldoctest dataframe
julia> using DataFrames
-julia> df = DataFrame(A = 1:4, B = ["M", "F", "F", "M"])
+julia> df = DataFrame(A=1:4, B=["M", "F", "F", "M"])
4×2 DataFrame
Row │ A B
│ Int64 String
@@ -221,7 +221,7 @@ data frame with two columns (note that the first column can only contain
integers and the second one can only contain strings):

```jldoctest dataframe
-julia> df = DataFrame(A = Int[], B = String[])
+julia> df = DataFrame(A=Int[], B=String[])
0×2 DataFrame
```

@@ -308,4 +308,3 @@ julia> Tables.rowtable(df)
(a = 1, b = 2)
(a = 3, b = 4)
```

2 changes: 1 addition & 1 deletion docs/src/man/importing_and_exporting.md
@@ -23,7 +23,7 @@ DataFrame(CSV.File(input))

A `DataFrame` can be written to a CSV file at path `output` using
```julia
-df = DataFrame(x = 1, y = 2)
+df = DataFrame(x=1, y=2)
CSV.write(output, df)
```
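
As a rough illustration of the reading and writing calls shown in this file, the sketch below round-trips a small data frame through a CSV file. The `path` variable and the use of `tempname()` are assumptions made for the example (the manual only refers to generic `input` and `output` paths), and it presumes that both CSV.jl and DataFrames.jl are installed.

```julia
using CSV
using DataFrames

# Hypothetical temporary file, chosen only for this sketch.
path = tempname() * ".csv"

df = DataFrame(x=1, y=2)
CSV.write(path, df)                   # write the data frame to disk

df_back = DataFrame(CSV.File(path))   # read the file back into a new data frame
```

Under these assumptions `df_back` should hold the same column names and values as `df`.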

26 changes: 13 additions & 13 deletions docs/src/man/joins.md
@@ -5,15 +5,15 @@ We often need to combine two or more data sets together to provide a complete pi
```jldoctest joins
julia> using DataFrames
-julia> people = DataFrame(ID = [20, 40], Name = ["John Doe", "Jane Doe"])
+julia> people = DataFrame(ID=[20, 40], Name=["John Doe", "Jane Doe"])
2×2 DataFrame
Row │ ID Name
│ Int64 String
─────┼─────────────────
1 │ 20 John Doe
2 │ 40 Jane Doe
-julia> jobs = DataFrame(ID = [20, 40], Job = ["Lawyer", "Doctor"])
+julia> jobs = DataFrame(ID=[20, 40], Job=["Lawyer", "Doctor"])
2×2 DataFrame
Row │ ID Job
│ Int64 String
@@ -55,7 +55,7 @@ See [the Wikipedia page on SQL joins](https://en.wikipedia.org/wiki/Join_(SQL))
Here are examples of different kinds of join:

```jldoctest joins
-julia> jobs = DataFrame(ID = [20, 60], Job = ["Lawyer", "Astronaut"])
+julia> jobs = DataFrame(ID=[20, 60], Job=["Lawyer", "Astronaut"])
2×2 DataFrame
Row │ ID Job
│ Int64 String
@@ -128,15 +128,15 @@ In order to join data frames on keys which have different names in the left and
you may pass `(left, right)` tuples or `left => right` pairs as `on` argument:

```jldoctest joins
-julia> a = DataFrame(ID = [20, 40], Name = ["John Doe", "Jane Doe"])
+julia> a = DataFrame(ID=[20, 40], Name=["John Doe", "Jane Doe"])
2×2 DataFrame
Row │ ID Name
│ Int64 String
─────┼─────────────────
1 │ 20 John Doe
2 │ 40 Jane Doe
-julia> b = DataFrame(IDNew = [20, 40], Job = ["Lawyer", "Doctor"])
+julia> b = DataFrame(IDNew=[20, 40], Job=["Lawyer", "Doctor"])
2×2 DataFrame
Row │ IDNew Job
│ Int64 String
@@ -156,9 +156,9 @@ julia> innerjoin(a, b, on = :ID => :IDNew)
Here is another example with multiple columns:

```jldoctest joins
-julia> a = DataFrame(City = ["Amsterdam", "London", "London", "New York", "New York"],
-                     Job = ["Lawyer", "Lawyer", "Lawyer", "Doctor", "Doctor"],
-                     Category = [1, 2, 3, 4, 5])
+julia> a = DataFrame(City=["Amsterdam", "London", "London", "New York", "New York"],
+                     Job=["Lawyer", "Lawyer", "Lawyer", "Doctor", "Doctor"],
+                     Category=[1, 2, 3, 4, 5])
5×3 DataFrame
Row │ City Job Category
│ String String Int64
@@ -169,9 +169,9 @@ julia> a = DataFrame(City = ["Amsterdam", "London", "London", "New York", "New Y
4 │ New York Doctor 4
5 │ New York Doctor 5
-julia> b = DataFrame(Location = ["Amsterdam", "London", "London", "New York", "New York"],
-                     Work = ["Lawyer", "Lawyer", "Lawyer", "Doctor", "Doctor"],
-                     Name = ["a", "b", "c", "d", "e"])
+julia> b = DataFrame(Location=["Amsterdam", "London", "London", "New York", "New York"],
+                     Work=["Lawyer", "Lawyer", "Lawyer", "Doctor", "Doctor"],
+                     Name=["a", "b", "c", "d", "e"])
5×3 DataFrame
Row │ Location Work Name
│ String String String
@@ -220,15 +220,15 @@ resulting data frame indicating whether the given row appeared only in the left,
the right or both data frames. Here is an example:

```jldoctest joins
-julia> a = DataFrame(ID = [20, 40], Name = ["John", "Jane"])
+julia> a = DataFrame(ID=[20, 40], Name=["John", "Jane"])
2×2 DataFrame
Row │ ID Name
│ Int64 String
─────┼───────────────
1 │ 20 John
2 │ 40 Jane
-julia> b = DataFrame(ID = [20, 60], Job = ["Lawyer", "Doctor"])
+julia> b = DataFrame(ID=[20, 60], Job=["Lawyer", "Doctor"])
2×2 DataFrame
Row │ ID Job
│ Int64 String
6 changes: 3 additions & 3 deletions docs/src/man/missing.md
@@ -73,9 +73,9 @@ a new `DataFrame` or mutate the original in-place respectively.
```jldoctest missings
julia> using DataFrames
-julia> df = DataFrame(i = 1:5,
-                      x = [missing, 4, missing, 2, 1],
-                      y = [missing, missing, "c", "d", "e"])
+julia> df = DataFrame(i=1:5,
+                      x=[missing, 4, missing, 2, 1],
+                      y=[missing, missing, "c", "d", "e"])
5×3 DataFrame
Row │ i x y
│ Int64 Int64? String?
2 changes: 1 addition & 1 deletion docs/src/man/split_apply_combine.md
@@ -405,7 +405,7 @@ Grouping a data frame using the `groupby` function can be seen as adding a looku
to it. Such lookups can be performed efficiently by indexing the resulting
`GroupedDataFrame` with a `Tuple` or `NamedTuple`:
```jldoctest sac
-julia> df = DataFrame(g = repeat(1:1000, inner=5), x = 1:5000)
+julia> df = DataFrame(g=repeat(1:1000, inner=5), x=1:5000)
5000×2 DataFrame
Row │ g x
│ Int64 Int64
18 changes: 9 additions & 9 deletions docs/src/man/working_with_dataframes.md
@@ -8,7 +8,7 @@ columns that fits on screen:
```jldoctest dataframe
julia> using DataFrames
-julia> df = DataFrame(A = 1:2:1000, B = repeat(1:10, inner=50), C = 1:500)
+julia> df = DataFrame(A=1:2:1000, B=repeat(1:10, inner=50), C=1:500)
500×3 DataFrame
Row │ A B C
│ Int64 Int64 Int64
@@ -72,8 +72,8 @@ its columns. For example in this case:
```jldoctest dataframe
julia> using CategoricalArrays
-julia> DataFrame(a = 1:2, b = [1.0, missing],
-                 c = categorical('a':'b'), d = [1//2, missing])
+julia> DataFrame(a=1:2, b=[1.0, missing],
+                 c=categorical('a':'b'), d=[1//2, missing])
2×4 DataFrame
Row │ a b c d
│ Int64 Float64? Cat… Rational…?
@@ -323,7 +323,7 @@ The indexing syntax can also be used to select rows based on conditions on
variables:

```jldoctest dataframe
-julia> df = DataFrame(A = 1:2:1000, B = repeat(1:10, inner=50), C = 1:500)
+julia> df = DataFrame(A=1:2:1000, B=repeat(1:10, inner=50), C=1:500)
500×3 DataFrame
Row │ A B C
│ Int64 Int64 Int64
@@ -731,7 +731,7 @@ The `describe` function returns a data frame summarizing the elementary
statistics and information about each column:

```jldoctest dataframe
-julia> df = DataFrame(A = 1:4, B = ["M", "F", "F", "M"])
+julia> df = DataFrame(A=1:4, B=["M", "F", "F", "M"])
4×2 DataFrame
Row │ A B
│ Int64 String
@@ -777,7 +777,7 @@ We can also apply a function to each column of a `DataFrame` using `combine`.
For example:

```jldoctest dataframe
-julia> df = DataFrame(A = 1:4, B = 4.0:-1.0:1.0)
+julia> df = DataFrame(A=1:4, B=4.0:-1.0:1.0)
4×2 DataFrame
Row │ A B
│ Int64 Float64
@@ -811,7 +811,7 @@ Functions that transform a `DataFrame` to produce a
new `DataFrame` always perform a copy of the columns by default, for example:

```jldoctest dataframe
-julia> df = DataFrame(A = 1:4, B = 4.0:-1.0:1.0)
+julia> df = DataFrame(A=1:4, B=4.0:-1.0:1.0)
4×2 DataFrame
Row │ A B
│ Int64 Float64
@@ -933,8 +933,8 @@ Replacement operations affecting a single column can be performed using `replace
```jldoctest replace
julia> using DataFrames
-julia> df = DataFrame(a = ["a", "None", "b", "None"], b = 1:4,
-                      c = ["None", "j", "k", "h"], d = ["x", "y", "None", "z"])
+julia> df = DataFrame(a=["a", "None", "b", "None"], b=1:4,
+                      c=["None", "j", "k", "h"], d=["x", "y", "None", "z"])
4×4 DataFrame
Row │ a b c d
│ String Int64 String String
18 changes: 9 additions & 9 deletions src/abstractdataframe/abstractdataframe.jl
@@ -155,7 +155,7 @@ See also: [`rename`](@ref)
# Examples
```jldoctest
-julia> df = DataFrame(i = 1, x = 2, y = 3)
+julia> df = DataFrame(i=1, x=2, y=3)
1×3 DataFrame
Row │ i x y
│ Int64 Int64 Int64
@@ -276,7 +276,7 @@ See also: [`rename!`](@ref)
# Examples
```jldoctest
-julia> df = DataFrame(i = 1, x = 2, y = 3)
+julia> df = DataFrame(i=1, x=2, y=3)
1×3 DataFrame
Row │ i x y
│ Int64 Int64 Int64
@@ -1038,11 +1038,11 @@ julia> filter(row -> row.x > 1, df)
julia> filter(row -> row["x"] > 1, df)
2×2 DataFrame
-Row │ x y
+Row │ x y
│ Int64 String
─────┼───────────────
1 │ 3 b
-2 │ 2 a
+2 │ 2 a
julia> filter(:x => x -> x > 1, df)
2×2 DataFrame
@@ -1161,14 +1161,14 @@ julia> filter!(row -> row.x > 1, df)
─────┼───────────────
1 │ 3 b
2 │ 2 a
julia> filter!(row -> row["x"] > 1, df)
2×2 DataFrame
-Row │ x y
+Row │ x y
│ Int64 String
─────┼───────────────
1 │ 3 b
-2 │ 2 a
+2 │ 2 a
julia> filter!(:x => x -> x == 3, df)
1×2 DataFrame
@@ -2250,7 +2250,7 @@ Return a data frame containing the rows in `df` in reversed order.
```jldoctest
julia> df = DataFrame(a=1:5, b=6:10, c=11:15)
5×3 DataFrame
-Row │ a b c
+Row │ a b c
│ Int64 Int64 Int64
─────┼─────────────────────
1 │ 1 6 11
@@ -2271,4 +2271,4 @@ julia> reverse(df)
5 │ 1 6 11
```
"""
-Base.reverse(df::AbstractDataFrame) = df[nrow(df):-1:1, :]
+Base.reverse(df::AbstractDataFrame) = df[nrow(df):-1:1, :]
4 changes: 2 additions & 2 deletions src/abstractdataframe/io.jl
@@ -106,7 +106,7 @@ Additionally selected MIME types support passing the following keyword arguments
# Examples
```jldoctest
-julia> show(stdout, MIME("text/latex"), DataFrame(A = 1:3, B = ["x", "y", "z"]))
+julia> show(stdout, MIME("text/latex"), DataFrame(A=1:3, B=["x", "y", "z"]))
\\begin{tabular}{r|cc}
\t& A & B\\\\
\t\\hline
@@ -118,7 +118,7 @@ julia> show(stdout, MIME("text/latex"), DataFrame(A = 1:3, B = ["x", "y", "z"]))
\\end{tabular}
14
-julia> show(stdout, MIME("text/csv"), DataFrame(A = 1:3, B = ["x", "y", "z"]))
+julia> show(stdout, MIME("text/csv"), DataFrame(A=1:3, B=["x", "y", "z"]))
"A","B"
1,"x"
2,"y"
Expand Down
20 changes: 10 additions & 10 deletions src/abstractdataframe/reshape.jl
@@ -42,11 +42,11 @@ that return views into the original data frame.
# Examples
```jldoctest
-julia> df = DataFrame(a = repeat(1:3, inner = 2),
-                      b = repeat(1:2, inner = 3),
-                      c = repeat(1:1, inner = 6),
-                      d = repeat(1:6, inner = 1),
-                      e = string.('a':'f'))
+julia> df = DataFrame(a=repeat(1:3, inner=2),
+                      b=repeat(1:2, inner=3),
+                      c=repeat(1:1, inner=6),
+                      d=repeat(1:6, inner=1),
+                      e=string.('a':'f'))
6×5 DataFrame
Row │ a b c d e
│ Int64 Int64 Int64 Int64 String
@@ -237,11 +237,11 @@ Row and column keys will be ordered in the order of their first appearance.
# Examples
```jldoctest
-julia> wide = DataFrame(id = 1:6,
-                        a = repeat(1:3, inner = 2),
-                        b = repeat(1.0:2.0, inner = 3),
-                        c = repeat(1.0:1.0, inner = 6),
-                        d = repeat(1.0:3.0, inner = 2))
+julia> wide = DataFrame(id=1:6,
+                        a=repeat(1:3, inner=2),
+                        b=repeat(1.0:2.0, inner=3),
+                        c=repeat(1.0:1.0, inner=6),
+                        d=repeat(1.0:3.0, inner=2))
6×5 DataFrame
Row │ id a b c d
│ Int64 Int64 Float64 Float64 Float64