Prepare for 1.6 release #3352

Merged
merged 5 commits into from Jul 9, 2023

bkamins
Member

@bkamins bkamins commented Jun 30, 2023

@timholy - I have the following problem when preparing for the 1.6 release of DataFrames.jl on Julia 1.9.1.

When I do @time using DataFrames from this branch in a fresh Julia session, I get the following:

julia> @time using DataFrames
  1.763248 seconds (1.94 M allocations: 123.502 MiB, 2.65% gc time, 28.95% compilation time: 99% of which was recompilation)

My question is how I can track down what causes the recompilation (we discussed it some time ago, I think, but I wanted to make sure I follow the current best practice). Thank you!

CC @nalimilan

@bkamins bkamins added the ecosystem Issues in DataFrames.jl ecosystem label Jun 30, 2023
@bkamins bkamins added this to the 1.6 milestone Jun 30, 2023
@bkamins
Member Author

bkamins commented Jul 2, 2023

OK. I think I have tracked it down. In the terminal, things look as follows:

julia> @time using DataFrames
  1.288108 seconds (1.23 M allocations: 76.504 MiB, 6.62% gc time, 0.31% compilation time)

and this is OK.

The problem is that I ran the previous tests under VSCode.

For reference, the instructions to check these things are in #3248.

And the output is (under VSCode):

julia> staletrees = precompile_blockers(trees, tinf)
1-element Vector{SnoopCompile.StaleTree}:
 inserting unwrapcontext(io::VSCodeServer.IJuliaCore.IJuliaStdio) @ VSCodeServer.IJuliaCore ~\.vscode\extensions\julialang.language-julia-1.47.2\scripts\packages\IJuliaCore\src\stdio.jl:24 invalidated:
   backedges: 1: MethodInstance for Base.unwrapcontext(::IO) at depth 1 with 2 children blocked InferenceTimingNode: 0.000054/0.000054 on IOContext(::IO) with 0 direct children
              2: MethodInstance for Base.unwrapcontext(::IO) at depth 1 with 2 children blocked InferenceTimingNode: 0.000054/0.000054 on IOContext(::IO) with 0 direct children
              3: MethodInstance for Base.unwrapcontext(::IO) at depth 1 with 9 children blocked InferenceTimingNode: 0.000132/0.000235 on IOContext(::IO, ::Pair{Symbol, Vector{Any}}) with 4 direct children
              4: MethodInstance for Base.unwrapcontext(::IO) at depth 1 with 11 children blocked InferenceTimingNode: 0.000132/0.000235 on IOContext(::IO, ::Pair{Symbol, Vector{Any}}) with 4 direct children

So the question is - how can we make sure that DataFrames.jl cleanly loads under VSCode? Any tips would be welcome.

CC @davidanthoff @pfitzseb (as maybe you saw something similar previously with other packages)
@ronisbr (as maybe the same issue is with other IO related things)

@pfitzseb
Contributor

pfitzseb commented Jul 3, 2023

I'm not sure if there's anything to do here. We're defining a new AbstractPipe in the extension with

struct IJuliaStdio{IO_t <: IO,F} <: Base.AbstractPipe
    io::IOContext{IO_t}
    send_callback::F
end
# ...
Base.unwrapcontext(io::IJuliaStdio) = Base.unwrapcontext(io.io)

which seems entirely correct (not quite harmless though, apparently).
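
To make the mechanism concrete, here is a minimal, Base-only sketch of the same pattern. The names `demo_unwrap`, `wrap_any`, and `DemoStdio` are hypothetical stand-ins for `Base.unwrapcontext`, an abstractly typed caller such as `IOContext(::IO)`, and `IJuliaStdio`. The sketch only shows how a later, more specific method extends a method table that an abstractly typed caller was already compiled against; it does not measure invalidations itself.

```julia
# Hypothetical stand-in for Base.unwrapcontext, so that this sketch does not
# touch any method table owned by Base.
demo_unwrap(io::IO) = (io, nothing)

# A caller that only knows its argument as an abstract IO.
wrap_any(io::IO) = demo_unwrap(io)

# Compile the caller for the abstract signature while only one method exists.
precompile(wrap_any, (IO,))
n_before = length(methods(demo_unwrap))

# A new IO subtype with its own, more specific method (mirrors the snippet above).
struct DemoStdio{IO_t<:IO,F} <: Base.AbstractPipe
    io::IOContext{IO_t}
    send_callback::F
end
demo_unwrap(io::DemoStdio) = demo_unwrap(io.io)

# The method table grew after wrap_any(::IO) was compiled; code that was
# specialized on the old table may need to be compiled again.
n_after = length(methods(demo_unwrap))
println("methods before: ", n_before, ", after: ", n_after)
```

In the thread the new method is added to a Base function, so the affected callers are in Base and in already precompiled packages rather than in a toy function.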

I can look into only loading the IJuliaCore package when necessary though (i.e. only for notebooks).

@davidanthoff
Contributor

davidanthoff commented Jul 3, 2023

I can look into only loading the IJuliaCore package when necessary though (i.e. only for notebooks).

Would that then cause the precompile cache for DataFrames to get invalidated whenever one switches between the VS Code Julia REPL and notebook in VS Code? Or are we already in a really bad situation there because it also gets invalidated if one switches just generally between a plain REPL and the VS Code Julia REPL? Or is this just not an issue at all?

@bkamins
Member Author

bkamins commented Jul 3, 2023

@davidanthoff The difference shows up between the VSCode Julia REPL and even a plain REPL started in VSCode:

Julia REPL in VSCode (the one integrated with the editor for code passing)

julia> @time using DataFrames
  3.553182 seconds (1.94 M allocations: 123.335 MiB, 4.26% gc time, 28.55% compilation time: 99% of which was recompilation)

plain REPL in VSCode (started julia process manually):

julia> @time using DataFrames
  2.570066 seconds (1.43 M allocations: 90.104 MiB, 5.86% gc time, 0.34% compilation time)

(Do not look at the absolute times, as they vary from machine to machine, and I am now on a very weak computer. Look instead at what percentage of load time is compilation time and how much of that is recompilation.)
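
As a rough illustration of reading those percentages, the sketch below turns one `@time` line into seconds spent on compilation and on recompilation. The helper name `recompilation_seconds` is made up for this example, and the regular expressions assume the Julia 1.9 output format shown in this thread.

```julia
# Hypothetical helper: split one line of `@time` output into total seconds,
# seconds of compilation, and seconds of recompilation.
function recompilation_seconds(line::AbstractString)
    total = parse(Float64, match(r"([\d.]+) seconds", line)[1])
    comp = match(r"([\d.]+)% compilation time", line)
    recomp = match(r"([\d.]+)% of which was recompilation", line)
    comp_frac = comp === nothing ? 0.0 : parse(Float64, comp[1]) / 100
    recomp_frac = recomp === nothing ? 0.0 : parse(Float64, recomp[1]) / 100
    return (total = total,
            compilation = total * comp_frac,
            recompilation = total * comp_frac * recomp_frac)
end

# The two DataFrames lines quoted above.
integrated = recompilation_seconds(
    "3.553182 seconds (1.94 M allocations: 123.335 MiB, 4.26% gc time, 28.55% compilation time: 99% of which was recompilation)")
plain = recompilation_seconds(
    "2.570066 seconds (1.43 M allocations: 90.104 MiB, 5.86% gc time, 0.34% compilation time)")

println("integrated REPL: ", round(integrated.recompilation; digits = 2), " s of recompilation")
println("plain REPL:      ", round(plain.recompilation; digits = 2), " s of recompilation")
```

On these two lines, recompilation accounts for roughly one second of the integrated-REPL load and for none of the plain-REPL load.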


I also suspect that this issue is not specific to DataFrames.jl. Here is an example with CSV.jl (even worse, as 60% of the time is spent in compilation under the Julia REPL):

Julia REPL in VSCode (the one integrated with the editor for code passing)

julia> @time using CSV
  1.608312 seconds (926.79 k allocations: 62.116 MiB, 60.53% compilation time: 99% of which was recompilation)

plain REPL in VSCode (started julia process manually):

julia> @time using CSV
  1.224910 seconds (421.61 k allocations: 28.538 MiB, 0.59% compilation time)

@davidanthoff
Contributor

So on my system it is actually consistently faster to run things from the integrated REPL in VS Code...

From the VS Code Julia REPL:

julia> @time using DataFrames
  1.384432 seconds (1.49 M allocations: 91.560 MiB, 3.40% gc time, 2.30% compilation time: 75% of which was recompilation)

From a normal REPL:

julia> @time using DataFrames
  1.735041 seconds (1.43 M allocations: 89.874 MiB, 5.62% gc time, 0.27% compilation time)

I tried this a couple of times, and I always get results roughly similar to that. Does that make any sense??

These runs are on Julia 1.9.1 on Windows, with project that only has DataFrames in it, on the branch of this PR here. And these timings are done after things were precompiled in a separate Julia session.


My worry was a different thing though, namely that different things would be cached depending on whether the precompile is triggered from a normal REPL or the VS Code integrated REPL. And that one would then end up overwriting the cache files every time one switches between the VS Code and standard REPL. But at least from a cursory trial I don't think that is actually happening, so that was probably a false worry.

@pfitzseb
Contributor

pfitzseb commented Jul 3, 2023

different things would be cached depending on whether the precompile is triggered from a normal REPL or the VS Code integrated REPL

Yeah, Julia can't cache methods from non-dependency packages, so we don't need to worry about that. I didn't time load time, but can confirm the invalidations.

@bkamins
Member Author

bkamins commented Jul 4, 2023

Does that make any sense??

For me, it is strange that you consistently see shorter timings in the VS Code Julia REPL, as it performs more allocations, allocates more memory, and triggers recompilation. Maybe there is some other setting (e.g. the number of threads used, if it affects anything here) that could influence this.

@davidanthoff
Contributor

I just tried again, and double checked any VS Code specific settings, but there aren't any, as far as I can tell... I also don't have a startup.jl and I also made sure that there is nothing in the global v1.9 project. And still get consistently faster load times in the integrated REPL. Maybe the VS Code extension stuff is loading something that is also needed by DataFrames later on, and that explains the speedup? But then it is weird that @bkamins isn't seeing that, right?

But maybe that is a side discussion. For the original specific question the only thing we could consider is a conditional exclude of the IJuliaCore stuff in VSCodeServer, I guess? But that might not be so easy, I assume that would then somehow mess with the precompile cache for VSCodeServer. But in any case, we can sort that out over in the julia-vscode repo.

@bkamins
Member Author

bkamins commented Jul 4, 2023

@pszufe @lkrain - can you please run these tests on your side and report what you get? If it is not clear what we compare please let me know.

@pszufe

pszufe commented Jul 5, 2023

I have run such tests on Julia 1.9.1 on Windows and I see a similar difference in loading times (yet in my case the number of allocations when running @time using DataFrames is almost identical).

However, IMO, the differences in loading times between VSCode and REPL can be easily explained. The REPL inside VS Code starts with the package VSCodeServer already loaded. This package has a subpackage VSCodeServer.JuliaInterpreter and JuliaInterpreter has DataFrames among its dependencies.

JuliaInterpreter also depends eg. on HTTP.jl and I could observe different behavior when loading HTTP in the REPL vs. in VS Code.

$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.9.1 (2023-06-07)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

(@v1.9) pkg> st DataFrames
Status `C:\JuliaPkg\Julia-1.9.1\environments\v1.9\Project.toml`
  [a93c6f00] DataFrames v1.6.0 `https://github.com/JuliaData/DataFrames.jl.git#bk/release16`

julia> @time using DataFrames
  2.167068 seconds (1.45 M allocations: 89.419 MiB, 6.53% gc time, 0.47% compilation time)

julia> @time using HTTP
  0.293497 seconds (90.24 k allocations: 5.964 MiB, 3.16% compilation time)

and VS Code

julia> @time using DataFrames
  1.737406 seconds (1.51 M allocations: 91.506 MiB, 4.06% gc time, 2.52% compilation time: 70% of which was recompilation)

julia> @time using HTTP
  0.262854 seconds (105.21 k allocations: 6.812 MiB)

@bkamins
Member Author

bkamins commented Jul 5, 2023

@pszufe - this is interesting and surprising, as DataFrames.jl and HTTP.jl are only [extras] of JuliaInterpreter.jl.

@KristofferC - I would expect that things in [extras] should not affect precompilation?

@KristofferC
Contributor

KristofferC commented Jul 5, 2023

If I print out what gets compiled I see

julia> using DataFrames
precompile(Tuple{typeof(Base.first), Array{Any, 1}})
precompile(Tuple{typeof(Revise.watch_package), Base.PkgId})
precompile(Tuple{typeof(VSCodeServer.getvariables), Bool})
precompile(Tuple{typeof(TableTraits.isiterabletable), Module})
precompile(Tuple{typeof(VSCodeServer.treerender), Nothing})
precompile(Tuple{typeof(TableTraits.isiterabletable), Nothing})

So something seems to invalidate watch_package, which then gets precompiled again (I've seen this happen before). That does not seem directly related to DataFrames though. Indeed, in a REPL:

julia> using Revise

julia> using DataFrames
precompile(Tuple{typeof(Base.isempty), Base.Set{Tuple{Revise.PkgData, String}}})
precompile(Tuple{typeof(Revise.iswritable), String})
precompile(Tuple{typeof(Base.first), Array{Any, 1}})
precompile(Tuple{typeof(Revise.watch_package), Base.PkgId})

So the difference in VSCode and REPL is that Revise is loaded in VSCode.

@davidanthoff
Contributor

JuliaInterpreter has DataFrames among its dependencies

JuliaInterpreter also depends eg. on HTTP.jl

No, that is not so :) Both of these are test only dependencies and will definitely not be loaded by default in the Julia VS Code REPL.

So the difference in VSCode and REPL is that Revise is loaded in VSCode.

I get the times that I reported even when I completely remove Revise from any project that might be loaded, so I don't think that can explain it.

@bkamins
Member Author

bkamins commented Jul 5, 2023

I get the times that I reported even when I completely remove Revise

My timings are without Revise also.

Both of these are test only dependencies and will definitely not be loaded by default in the Julia VS Code REPL.

Yes, but what @pszufe suggests is that, although they should not be loaded, maybe some signatures from them get loaded anyway (I agree this is unlikely, so this was just his working hypothesis).

@KristofferC
Contributor

I get the times that I reported even when I completely remove Revise from any project that might be loaded, so I don't think that can explain it.

Can you run the VSCode REPL with --trace-compile=stderr, write some random stuff in the REPL to get rid of all the noise, then load DataFrames and see what is printed out?
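
For readers who want to see what `--trace-compile=stderr` prints before trying it inside VS Code, here is a small sketch that runs a throwaway Julia process with that flag and collects the `precompile(...)` statements. The toy function `f` and the temporary file are assumptions made for the example; the child process is started without a startup file to keep the trace short.

```julia
# Code executed in the child process; `f` is a hypothetical toy function that
# has to be compiled on its first call.
code = "f(x) = x + 1; f(1)"
cmd = `$(Base.julia_cmd()) --startup-file=no --trace-compile=stderr -e $code`

# Send the child's stderr to a temporary file and read it back afterwards.
trace_file = tempname()
run(pipeline(cmd; stdout = devnull, stderr = trace_file))
trace = filter(l -> startswith(l, "precompile("), readlines(trace_file))
rm(trace_file; force = true)

# Each remaining line names one method signature that was compiled at run time.
foreach(println, trace)
```

In the integrated REPL the same flag has to be passed to the Julia process that the extension starts, which is configured on the VS Code side and is not shown here.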

@bkamins
Member Author

bkamins commented Jul 5, 2023

Stand alone Julia session:

precompile(Tuple{typeof(Base.check_open), Base.TTY})
julia> using DataFrames
precompile(Tuple{typeof(Base.first), Array{Any, 1}})

VSCode Julia REPL (I leave out a lot of stuff printed before using DataFrames):

julia> using DataFrames
precompile(Tuple{VSCodeServer.var"#102#104"{REPL.LineEditREPL, REPL.LineEdit.Prompt}, Expr})
precompile(Tuple{typeof(Base.indexed_iterate), Tuple{Pair{String, String}, Int64}, Int64})
precompile(Tuple{typeof(Base.indexed_iterate), Tuple{Pair{String, String}, Int64}, Int64, Int64})
precompile(Tuple{typeof(VSCodeServer.evalrepl), Module, Expr, REPL.LineEditREPL, REPL.LineEdit.Prompt})
precompile(Tuple{typeof(VSCodeServer.JSONRPC.send_notification), VSCodeServer.JSONRPC.JSONRPCEndpoint{Base.PipeEndpoint, Base.PipeEndpoint}, String, Nothing})
precompile(Tuple{typeof(VSCodeServer.JSON.Writer.print), Base.GenericIOBuffer{Array{UInt8, 1}}, Base.Dict{String, Union{Nothing, String}}})
precompile(Tuple{typeof(VSCodeServer.run_with_backend), Function})
precompile(Tuple{typeof(Base.put!), Base.Channel{Any}, Tuple{VSCodeServer.var"#106#108"{Module, Expr, REPL.LineEditREPL, REPL.LineEdit.Prompt}, Tuple{}}})
precompile(Tuple{typeof(Base.indexed_iterate), Tuple{VSCodeServer.var"#106#108"{Module, Expr, REPL.LineEditREPL, REPL.LineEdit.Prompt}, Tuple{}}, Int64})
precompile(Tuple{typeof(Base.indexed_iterate), Tuple{VSCodeServer.var"#106#108"{Module, Expr, REPL.LineEditREPL, REPL.LineEdit.Prompt}, Tuple{}}, Int64, Int64})
precompile(Tuple{typeof(Base.invokelatest), Any})
precompile(Tuple{Type{NamedTuple{(:is_repl,), T} where T<:Tuple}, Tuple{Bool}})
precompile(Tuple{Type{VSCodeServer.InlineDisplay}, Bool})
precompile(Tuple{typeof(Base.convert), Type{Base.Multimedia.AbstractDisplay}, VSCodeServer.InlineDisplay})
precompile(Tuple{VSCodeServer.var"#106#108"{Module, Expr, REPL.LineEditREPL, REPL.LineEdit.Prompt}})
precompile(Tuple{typeof(Base.require), Module, Symbol})
precompile(Tuple{typeof(Base._require_prelocked), Base.PkgId, String})
precompile(Tuple{typeof(Base.first), Array{Any, 1}})
precompile(Tuple{typeof(VSCodeServer.unwrap), VSCodeServer.Wrapper})
precompile(Tuple{REPL.var"#57#58"{REPL.LineEditREPL, Pair{Any, Bool}, Bool, Bool}, Any})

julia> precompile(Tuple{typeof(VSCodeServer.getvariables), Bool})
precompile(Tuple{typeof(TableTraits.isiterabletable), Module})
precompile(Tuple{typeof(VSCodeServer.wsicon), Nothing})
precompile(Tuple{Type{VSCodeServer.SubTree}, String, String, Nothing})
precompile(Tuple{typeof(VSCodeServer.treerender), Nothing})
precompile(Tuple{typeof(Base.show), Base.IOContext{VSCodeServer.LimitIO{Base.GenericIOBuffer{Array{UInt8, 1}}}}, Nothing})
precompile(Tuple{typeof(VSCodeServer.can_display), Nothing})
precompile(Tuple{typeof(TableTraits.isiterabletable), Nothing})
precompile(Tuple{typeof(VSCodeServer.JSON.Writer.show_pair), VSCodeServer.JSON.Writer.CompactContext{Base.GenericIOBuffer{Array{UInt8, 1}}}, VSCodeServer.JSON.Serializations.StandardSerialization, Symbol, Nothing})
julia> 

(note that there are extra julia> prompts, but they are likely there because tracing prints asynchronously)

@KristofferC
Contributor

KristofferC commented Jul 5, 2023

Probably these then

precompile(Tuple{typeof(Base.require), Module, Symbol})
precompile(Tuple{typeof(Base._require_prelocked), Base.PkgId, String})

Something has invalidated the package loading code itself. Should be fixed on 1.10 at least (package loading code is shielded from invalidations there).
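
A trace like the one above can be triaged mechanically. The following sketch groups a few of the statements from this thread by the top-level module that owns the traced function and picks out the ones that hit the package-loading entry points. The helper `owner_module` is hypothetical, and closure signatures without `typeof(...)` would fall into the "other" bucket.

```julia
# A few statements copied from the trace posted earlier in this thread.
trace = [
    "precompile(Tuple{typeof(Base.require), Module, Symbol})",
    "precompile(Tuple{typeof(Base._require_prelocked), Base.PkgId, String})",
    "precompile(Tuple{typeof(Base.first), Array{Any, 1}})",
    "precompile(Tuple{typeof(VSCodeServer.unwrap), VSCodeServer.Wrapper})",
    "precompile(Tuple{typeof(TableTraits.isiterabletable), Module})",
]

# Hypothetical helper: top-level module of the function named in typeof(...).
function owner_module(stmt::AbstractString)
    m = match(r"typeof\(([A-Za-z_]\w*)\.", stmt)
    return m === nothing ? "other" : String(m[1])
end

# Count statements per owning module.
counts = Dict{String,Int}()
for stmt in trace
    owner = owner_module(stmt)
    counts[owner] = get(counts, owner, 0) + 1
end

# Statements that touch Base.require or Base._require_prelocked.
loading = filter(s -> occursin(r"typeof\(Base\._?require", s), trace)

println(counts)
foreach(println, loading)
```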

@bkamins
Member Author

bkamins commented Jul 5, 2023

OK. Thank you. So, in summary: we can close this discussion in DataFrames.jl, and the precompilation issue:

  1. Should be tracked in the VSCode extension, as it is a general issue.
  2. Should be fixed in Julia 1.10.

Thank you!

(so those who are interested in this PR can unfollow it, as likely now we will discuss with @nalimilan what we should have in precompilation statements for the next release)

@bkamins
Member Author

bkamins commented Jul 6, 2023

@nalimilan - so now concentrating on the proposed change under Julia 1.9.2.

The proposed change increases initial compilation time by 16 seconds and load time by 0.2 seconds.

In short, the change additionally covers the following in precompilation:

  • InlineStrings.jl;
  • PooledArrays.jl;
  • Int32 integers.

These are common when working with CSV and large files. The question is whether we want to pay this cost.
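
Why covering more types costs compile time can be sketched with Base alone: every concrete argument signature is a separate specialization that has to be compiled and cached. The function `group_sum` below is a hypothetical stand-in, not a DataFrames.jl internal, and `Base.precompile` is used here instead of whatever workload this PR actually contains.

```julia
# Hypothetical generic kernel: sum values per key.
function group_sum(keys::AbstractVector, vals::AbstractVector)
    acc = Dict{eltype(keys),eltype(vals)}()
    for (k, v) in zip(keys, vals)
        acc[k] = get(acc, k, zero(v)) + v
    end
    return acc
end

# Covering Int32 keys in addition to Int64 keys means one more
# specialization to compile.
signatures = [
    (Vector{Int64}, Vector{Float64}),
    (Vector{Int32}, Vector{Float64}),
]
results = [precompile(group_sum, sig) for sig in signatures]
println(results)
```

Inside a package, statements like these run during precompilation, which is where the extra initial compilation time comes from.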

The detailed timings are:

Prepared release 1.6

Initial compilation time

$ julia --project -e "@time using DataFrames"
 58.688387 seconds (1.45 M allocations: 91.935 MiB, 0.17% gc time, 0.09% compilation time)

imports time after initial compilation

julia> @time_imports using DataFrames
      1.0 ms  Statistics
      0.5 ms  Reexport
      0.3 ms  Compat
      0.3 ms  Compat → CompatLinearAlgebraExt
      6.5 ms  OrderedCollections
     40.1 ms  DataStructures
      0.9 ms  SortingAlgorithms
      1.3 ms  DataAPI
     11.4 ms  PooledArrays
      9.1 ms  Missings
      1.8 ms  InvertedIndices
      0.3 ms  IteratorInterfaceExtensions
      0.2 ms  TableTraits
      0.6 ms  Formatting
      0.2 ms  DataValueInterfaces
     22.6 ms  Tables
    273.7 ms  StringManipulation
     39.1 ms  Crayons
      0.7 ms  LaTeXStrings
     70.1 ms  PrettyTables
     10.5 ms  Preferences
      0.4 ms  PrecompileTools
     23.1 ms  SentinelArrays
     27.6 ms  Parsers
      5.5 ms  InlineStrings
    818.0 ms  DataFrames

total time after initial compilation

$ julia -e "@time using DataFrames"
  1.409258 seconds (1.43 M allocations: 89.904 MiB, 6.59% gc time, 0.29% compilation time)

Current release 1.5

Initial compilation time

$ julia -e "@time using DataFrames"
 42.755615 seconds (1.25 M allocations: 78.180 MiB, 0.20% gc time, 0.12% compilation time)

imports time after initial compilation

julia> @time_imports using DataFrames
      1.1 ms  Statistics
      0.3 ms  Reexport
      0.4 ms  Compat
      0.3 ms  Compat → CompatLinearAlgebraExt
      5.3 ms  OrderedCollections
     41.2 ms  DataStructures
      1.1 ms  SortingAlgorithms
      0.8 ms  DataAPI
     15.9 ms  PooledArrays
      9.6 ms  Missings
      1.8 ms  InvertedIndices
      0.2 ms  IteratorInterfaceExtensions
      0.3 ms  TableTraits
      0.9 ms  Formatting
      0.3 ms  DataValueInterfaces
     33.2 ms  Tables
    313.1 ms  StringManipulation
     40.5 ms  Crayons
      0.7 ms  LaTeXStrings
     80.6 ms  PrettyTables
     10.7 ms  Preferences
      0.5 ms  SnoopPrecompile
     26.5 ms  SentinelArrays
      0.6 ms  PrecompileTools
     90.2 ms  Parsers
      5.3 ms  InlineStrings
    623.5 ms  DataFrames

total time after initial compilation

$ julia -e "@time using DataFrames"
  1.244608 seconds (1.23 M allocations: 76.564 MiB, 5.42% gc time, 0.33% compilation time)
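
To compare two `@time_imports` listings without reading them row by row, a small parser is enough. The sketch below uses only a few rows copied from the listings above, and the helper name `parse_imports` is made up for this example.

```julia
# Parse "@time_imports" rows of the form "  273.7 ms  PackageName".
function parse_imports(text::AbstractString)
    timings = Dict{String,Float64}()
    for line in split(text, '\n')
        m = match(r"^\s*([\d.]+) ms\s+(.+?)\s*$", line)
        m === nothing && continue
        timings[String(m[2])] = parse(Float64, m[1])
    end
    return timings
end

# A few rows from each listing (not the full tables).
v16 = parse_imports("""
     22.6 ms  Tables
     27.6 ms  Parsers
    818.0 ms  DataFrames
""")
v15 = parse_imports("""
     33.2 ms  Tables
     90.2 ms  Parsers
    623.5 ms  DataFrames
""")

# Positive values mean the package loads slower in the prepared 1.6 release.
delta = Dict(name => v16[name] - v15[name] for name in keys(v16))
for (name, d) in sort(collect(delta); by = last, rev = true)
    println(rpad(name, 12), round(d; digits = 1), " ms")
end
```

The headline numbers follow from the `@time` lines in the same way: 58.69 s minus 42.76 s is about 15.9 s of extra initial compilation, and 1.409 s minus 1.245 s is about 0.16 s of extra load time.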

@jariji
Contributor

jariji commented Jul 6, 2023

It would be useful to quantify the benefit too (in addition to the cost shown above). That is, how much faster are common operations?

@bkamins
Member Author

bkamins commented Jul 6, 2023

That is, how much faster are common operations?

This depends on what one sees as "common". As commented, the difference will be visible if you use CSV.jl. E.g. even one groupby by PooledVector of InlineString will already justify the difference:

DataFrames.jl 1.5

julia> using DataFrames

julia> using CSV

julia> str = """
       a,b
       xx,1.0
       xx,2.0
       xx,3.0
       yy,4.0
       yy,5.0
       yy,6.0
       yy,7.0
       """;

julia> @time df = CSV.read(IOBuffer(str), DataFrame);
  0.102912 seconds (69.56 k allocations: 4.470 MiB, 98.37% compilation time)

julia> @time gdf = groupby(df, :a);
  0.633322 seconds (1.04 M allocations: 69.960 MiB, 6.65% gc time, 99.97% compilation time)

julia> @time combine(gdf, :b => sum∘skipmissing);
  0.241224 seconds (580.24 k allocations: 30.936 MiB, 4.18% gc time, 99.64% compilation time: 49% of which was recompilation)

DataFrames.jl 1.6 proposal

julia> using DataFrames

julia> using CSV

julia> str = """
       a,b
       xx,1.0
       xx,2.0
       xx,3.0
       yy,4.0
       yy,5.0
       yy,6.0
       yy,7.0
       """;

julia> @time df = CSV.read(IOBuffer(str), DataFrame);
  0.097152 seconds (54.35 k allocations: 3.436 MiB, 98.10% compilation time)

julia> @time gdf = groupby(df, :a);
  0.000558 seconds (53 allocations: 3.391 KiB)

julia> @time combine(gdf, :b => sum∘skipmissing);
  0.135097 seconds (470.23 k allocations: 23.416 MiB, 99.50% compilation time: 79% of which was recompilation)

If you do not use CSV.jl, then most likely you do not use InlineStrings.jl and PooledArrays.jl either (at least as a typical user), and this will just be a cost for you.

So, again, in short - this is an optimization for CSV.jl users.

@jariji
Contributor

jariji commented Jul 6, 2023

even one groupby by PooledVector of InlineString will already justify the difference

I might be misunderstanding, but if the user does a second groupby, won't it be compiled already because of the first execution? So I don't understand why you say "even" there.

@bkamins
Member Author

bkamins commented Jul 6, 2023

you say "even" there.

Because you can do operations other than groupby, and because compilation of groupby is conditional on the column type, so another groupby on another InlineStrings type would incur the same cost.

Second groupby with identical data types passed will indeed be cached.

@nilshg
Contributor

nilshg commented Jul 7, 2023

I haven't thought about this too deeply, but my gut feel is that the increase in compilation time is too large to warrant this.

As you say, this is for CSV users only, so many use cases won't benefit at all. Equally, one should keep in mind that in practice compilation doesn't tend to be a one-time cost: e.g. transitive dependencies update and force a recompilation, people use Pluto notebooks which create new and often slightly different environments, and there might still be issues with false positives when detecting stale cache files, which also trigger recompilation.

What are the timings on 1.10/nightly? Anecdotally compilation times have come down quite a bit so might that make the tradeoff more palatable?

@bkamins
Member Author

bkamins commented Jul 9, 2023

I have done some more benchmarking. My conclusion is to stay with the status quo, given the discussion. When Julia 1.10 is out we can do some additional testing and make a change if we decide we want to (it will be a patch release, as it will not change any functionality).

Here is the timing of the status quo (current state of this PR) on Julia 1.9.2 vs Julia 1.10-alpha1 (in general, Julia 1.10 does indeed promise faster precompilation and load times):

Julia 1.9.2

julia> @time using DataFrames # precompilation time
[ Info: Precompiling DataFrames [a93c6f00-e57d-5684-b7b6-d8193f3e46c0]
 37.531438 seconds (1.25 M allocations: 77.652 MiB, 0.14% gc time, 0.13% compilation time)

$ julia --project -e "@time using DataFrames" # load time
  1.119158 seconds (1.22 M allocations: 76.813 MiB, 6.35% gc time, 0.34% compilation time)

Julia 1.10-alpha1

julia> @time using DataFrames # precompilation time
Precompiling DataFrames
  1 dependency successfully precompiled in 28 seconds. 27 already precompiled.
 29.642817 seconds (4.42 M allocations: 319.372 MiB, 0.40% gc time, 3.56% compilation time)

$ julia --project -e "@time using DataFrames" # load time
  0.767517 seconds (948.04 k allocations: 93.187 MiB, 6.48% gc time, 1.17% compilation time)
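
The relative improvement between the two Julia versions can be read off with a few lines of arithmetic. The numbers are copied from the listings above; the field names `v19` and `v110` and the helper `saving` are made up for this sketch.

```julia
# Timings in seconds, copied from the listings above.
timings = (
    precompile = (v19 = 37.531438, v110 = 29.642817),
    load       = (v19 = 1.119158,  v110 = 0.767517),
)

# Fraction of time saved when moving from Julia 1.9.2 to Julia 1.10-alpha1.
saving(t) = (t.v19 - t.v110) / t.v19

for (name, t) in pairs(timings)
    println(rpad(string(name), 12), round(100 * saving(t); digits = 1), "% less time")
end
```

These are single runs on one machine, so the percentages are indicative only.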

In summary - my proposal would be to merge this PR as it is now (only bumping package version) and go back to the precompilation discussion after Julia 1.10 is out.

After this PR is merged I would make the 1.6 release of DataFrames.jl.

@bkamins bkamins merged commit 8ba2288 into main Jul 9, 2023
6 of 7 checks passed
@bkamins bkamins deleted the bk/release16 branch July 9, 2023 20:54
@bkamins
Member Author

bkamins commented Jul 9, 2023

Thank you!
