Core.stdout isn't scalable; gets slower over time (with high/er loopcount) #59325

@PallHaraldsson

EDIT: It IS scalable (superlinearly; both Core.stdout and stdout, just differently). I explained on Discourse the mistaken assumption that led me to the wrong conclusion, so I'm closing this issue.

[Important for juliac]

It’s up to at least 15x slower.

[EDIT: I see now that the slowdown is explained by Core.stdout not buffering internally, so it does a syscall for each print. Maybe it could be made to buffer too? There's probably no memory leak as I first suspected. I still want to double-check whether it really doesn't scale or whether I measured incorrectly. The Linux kernel does seem to rate-limit, but only for stdout, not Core.stdout. So maybe there is nothing to do here, other than perhaps adding buffering.]

versus 5x or less with a lower loop count than in:

for _ in 1:200000; print(Core.stdout, "Hello"); end
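To check whether the per-print cost really grows with the loop count, the loop can be timed at several sizes. This is only a rough sketch: `seconds_per_print` is a hypothetical helper name, and the first call also includes compilation time, so a warm-up call is advisable before comparing numbers.

```julia
# Hypothetical helper: average seconds per print when printing "Hello" n times to `io`.
function seconds_per_print(io, n::Integer)
    t = @elapsed for _ in 1:n
        print(io, "Hello")
    end
    return t / n
end

# Example use (results go to stderr so they do not mix with the printed output):
# for n in (2_000, 20_000, 200_000)
#     println(stderr, n, " => ", seconds_per_print(Core.stdout, n))
# end
```

If the cost per print stays roughly constant across loop counts, the slowdown is a fixed per-call overhead rather than something that accumulates.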

I suspect a memory leak (any hidden allocations, like with regular print to stdout?). Julia's allocation counters don't show allocations made by calling malloc directly. What else could it be?
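The point about malloc being invisible can be illustrated with `@allocated`, which only counts memory handed out by Julia's GC. A minimal sketch, assuming a hypothetical helper `malloc_roundtrip` that goes through `Libc.malloc` and `Libc.free`:

```julia
# Hypothetical helper: allocate and free 1 MiB directly through C malloc.
function malloc_roundtrip()
    p = Libc.malloc(1 << 20)
    Libc.free(p)
    return nothing
end

malloc_roundtrip()                        # warm up so compilation is not counted
gc_bytes = @allocated malloc_roundtrip()  # GC-tracked bytes only
# gc_bytes is expected to stay far below 1 MiB, since the malloc'd block is not counted.
```

So a zero (or tiny) allocation count from Julia's tooling does not rule out malloc traffic inside the runtime.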

I opened a thread about this, with a longer test, here:
https://discourse.julialang.org/t/regular-println-vs-core-stdout/131685/7?u=palli

Eventually jl_uv_puts is called, which uses

malloc_s(sizeof(uv_write_t) + n)

and thus malloc since

STATIC_INLINE void *malloc_s(size_t sz) JL_NOTSAFEPOINT {
(and memcpy here):

https://github.com/mmtk/julia/blob/b35c4f471f05873a4de6a989e0038a5c28cd09ce/src/jl_uv.c#L701
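Since each call pays for its own malloc, memcpy and write, one possible workaround is to batch the output in user code and hand it over in a single call. The following is only a sketch of that idea, not something Julia does for Core.stdout; `print_batched` is a hypothetical helper name.

```julia
# Hypothetical helper: collect n copies of `msg` in memory, then print them in one call.
function print_batched(io, msg::AbstractString, n::Integer)
    buf = IOBuffer()
    for _ in 1:n
        print(buf, msg)          # cheap in-memory append, no syscall
    end
    out = String(take!(buf))
    print(io, out)               # a single print call on the target stream
    return sizeof(out)           # number of bytes handed over
end

# Example use:
# print_batched(Core.stdout, "Hello", 200_000)
```

The trade-off is memory: the whole output is held in the buffer before anything is written.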

i.e.

s = "Hello"; ccall(:jl_uv_puts, Cvoid, (Ptr{Cvoid}, Ptr{UInt8}, UInt), io_pointer(stdout), pointer(s), sizeof(s))

after implicitly doing:

julia> abstract type IO end
julia> struct CoreSTDOUT <: IO end

julia> const Core_stdout = CoreSTDOUT()  # Core.stdout in Julia's source
CoreSTDOUT()

julia> io_pointer(::CoreSTDOUT) = Core.Intrinsics.pointerref(Core.Intrinsics.cglobal(:jl_uv_stdout, Ptr{Cvoid}), 1, 1)  # implicit, slight modification from Julia's source

julia> ccall(:jl_uv_puts, Cvoid, (Ptr{Cvoid}, Ptr{UInt8}, UInt), io_pointer(Core_stdout), pointer(s), sizeof(s))
Hello
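The REPL steps above can also be wrapped in a function. This is a sketch under the assumption that `jl_uv_stdout` and `jl_uv_puts` stay reachable through `cglobal` and `ccall` as in the transcript; `raw_puts` is a hypothetical name, and `GC.@preserve` is added so the string stays rooted while its raw pointer is in use.

```julia
# Hypothetical wrapper around the raw call shown in the transcript.
function raw_puts(s::String)
    # Load the stream handle stored in the C global `jl_uv_stdout`.
    handle = unsafe_load(cglobal(:jl_uv_stdout, Ptr{Cvoid}))
    # Keep `s` rooted while its raw pointer is passed to C.
    GC.@preserve s ccall(:jl_uv_puts, Cvoid, (Ptr{Cvoid}, Ptr{UInt8}, UInt),
                         handle, pointer(s), sizeof(s))
    return sizeof(s)
end
```

Calling `raw_puts("Hello")` goes through the same `jl_uv_puts` path, so it pays the same per-call cost as `print(Core.stdout, "Hello")`.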

Copied and modified from this source:

abstract type IO end
struct CoreSTDOUT <: IO end
struct CoreSTDERR <: IO end
const stdout = CoreSTDOUT()
const stderr = CoreSTDERR()
io_pointer(::CoreSTDOUT) = Intrinsics.pointerref(Intrinsics.cglobal(:jl_uv_stdout, Ptr{Cvoid}), 1, 1)
io_pointer(::CoreSTDERR) = Intrinsics.pointerref(Intrinsics.cglobal(:jl_uv_stderr, Ptr{Cvoid}), 1, 1)
