
issue(fs): assertion !frame_is_full(ctx->frame) failed at /nanos/src/kernel/mutex.c:109 (IP 0xffffffff800460b4) #1933

Closed
rinor opened this issue Sep 1, 2023 · 2 comments · Fixed by #1934

rinor commented Sep 1, 2023

While re-testing #1845 after #1904 was merged into the main branch (at 2da53c8), I got:

frame trace: 
ffffc00000a47b98:   ffffffff800461ee	(mutex_lock + 000000000000000e/000000000000003f)
ffffc00000a47ba8:   ffffffff80097983	(filesystem_storage_write + 0000000000000073/0000000000000132)
ffffc00000a47c08:   ffffffff8004ee81	(pagecache_commit_dirty_ranges + 0000000000000611/0000000000000eb0)
ffffc00000a47d98:   ffffffff8004a1f1	(pagecache_commit_dirty_node + 0000000000000181/0000000000000355)
ffffc00000a47e28:   ffffffff8005022d	(pagecache_sync_volume + 00000000000000fd/00000000000002c2)
ffffc00000a47ea8:   ffffffff800580e5	(storage_sync + 0000000000000045/00000000000000ef)
ffffc00000a47ee8:   ffffffff800b5459	(demand_anonymous_page + 00000000000000e9/0000000000000257)
ffffc00000a47f38:   ffffffff800ebb27	(unix_fault_handler + 00000000000001d7/0000000000001027)
ffffc00000a47f98:   ffffffff8011be5d	(common_handler + 000000000000038d/00000000000005c7)
ffffc00000a47fe8:   ffffffff800010ed
ffffc0000208ff20:   ffffffff80080834	(sg_copy_to_buf_and_release + 0000000000000114/0000000000000203)
ffffc0000208ff70:   ffffffff800db310	(file_read_complete + 00000000000000a0/00000000000003cd)
ffffc0000208ffc0:   ffffffff800414ef	(context_switch_finish + 000000000000006f/00000000000001b4)

loaded klibs: 
assertion !frame_is_full(ctx->frame) failed at /usr/local/src/nanos/src/kernel/mutex.c:109 (IP 0xffffffff800460b4)  in mutex_lock_internal(); halt

Note: a similar failure was also observed in #1874 under different conditions and was fixed by #1875.


config.json
{
    "BaseVolumeSz": "4m",
    "Args": [
        "-w=10",
        "-f=1",
        "-s=128",
        "-t=0",
        "-c=false",
        "-d=/tfs_mnt"
    ],
    "Mounts": {
        "stressdisk": "tfs_mnt"
    },
    "RunConfig": {
        "Memory": "614m"
    },
    "Env": {
        "GODEBUG": "gctrace=0,scavtrace=0"
    },
    "Boot": "/usr/local/src/nanos/output/platform/pc/boot/boot.img",
    "Kernel": "/usr/local/src/nanos/output/platform/pc/bin/kernel.img",
    "KlibDir": "/usr/local/src/nanos/output/klib/bin",
    "Debugflags": [
        "reboot_on_exit"
    ]
}
ops volume create stressdisk -s 1G # extra volume clean storage
ops volume create stressdisk -s 1G
ops run -c config.json stressdisk --smp=4 # with clean storage
ops run -c config.json stressdisk --smp=4

running local instance
booting /home/rinor/.ops/images/stressdisk ...
en1: assigned 10.0.2.15
2023/09/01 16:53:06 /proc/meminfo
MemTotal:        623680 kB
MemFree:         565996 kB
MemAvailable:    567432 kB
Buffers:              0 kB
SwapTotal:            0 kB
SwapFree:             0 kB
Cached:            1436 kB

2023/09/01 16:53:06 WAITING - 3 seconds ...
en1: assigned FE80::D0E3:EDFF:FE83:40D
2023/09/01 16:53:09 WAITING - done
2023/09/01 16:53:09 
2023/09/01 16:53:09 STEP(1): Checking for existing data on (/tfs_mnt)
2023/09/01 16:53:09 
2023/09/01 16:53:09 
2023/09/01 16:53:09 
2023/09/01 16:53:09 STEP(2): Trying to write (134217728 bytes => 262144 blocks)/file on (/tfs_mnt)
2023/09/01 16:53:09 
2023/09/01 16:53:10 
2023/09/01 16:53:10 STEP(2): Approx (7) file(s) expected to succeed, (1) additional file(s) may be attempted
2023/09/01 16:53:10 STEP(2): /tfs_mnt/TST_0001_a5672d94d89fe35f1eeed8767dbf531d32b8cc71 - a5672d94d89fe35f1eeed8767dbf531d32b8cc71 (134217728 B) - OK
2023/09/01 16:53:10 /proc/meminfo
MemTotal:        623680 kB
MemFree:         295408 kB
MemAvailable:    427976 kB
Buffers:              0 kB
SwapTotal:            0 kB
SwapFree:             0 kB
Cached:          132568 kB

2023/09/01 16:53:10 STEP(2): /tfs_mnt/TST_0002_a5672d94d89fe35f1eeed8767dbf531d32b8cc71 - a5672d94d89fe35f1eeed8767dbf531d32b8cc71 (134217728 B) - OK
2023/09/01 16:53:10 /proc/meminfo
MemTotal:        623680 kB
MemFree:         158868 kB
MemAvailable:    422508 kB
Buffers:              0 kB
SwapTotal:            0 kB
SwapFree:             0 kB
Cached:          263640 kB

2023/09/01 16:53:11 STEP(2): /tfs_mnt/TST_0003_a5672d94d89fe35f1eeed8767dbf531d32b8cc71 - a5672d94d89fe35f1eeed8767dbf531d32b8cc71 (134217728 B) - OK
2023/09/01 16:53:11 /proc/meminfo
MemTotal:        623680 kB
MemFree:          20288 kB
MemAvailable:    414528 kB
Buffers:              0 kB
SwapTotal:            0 kB
SwapFree:             0 kB
Cached:          394240 kB

2023/09/01 16:53:18 STEP(2): /tfs_mnt/TST_0004_a5672d94d89fe35f1eeed8767dbf531d32b8cc71 - a5672d94d89fe35f1eeed8767dbf531d32b8cc71 (134217728 B) - OK
2023/09/01 16:53:18 /proc/meminfo
MemTotal:        623680 kB
MemFree:          66888 kB
MemAvailable:    330016 kB
Buffers:              0 kB
SwapTotal:            0 kB
SwapFree:             0 kB
Cached:          263128 kB

2023/09/01 16:53:19 STEP(2): /tfs_mnt/TST_0005_a5672d94d89fe35f1eeed8767dbf531d32b8cc71 - a5672d94d89fe35f1eeed8767dbf531d32b8cc71 (134217728 B) - OK
2023/09/01 16:53:19 /proc/meminfo
MemTotal:        623680 kB
MemFree:          70984 kB
MemAvailable:    334152 kB
Buffers:              0 kB
SwapTotal:            0 kB
SwapFree:             0 kB
Cached:          263168 kB

2023/09/01 16:53:19 STEP(2): /tfs_mnt/TST_0006_a5672d94d89fe35f1eeed8767dbf531d32b8cc71 - a5672d94d89fe35f1eeed8767dbf531d32b8cc71 (134217728 B) - OK
2023/09/01 16:53:19 /proc/meminfo
MemTotal:        623680 kB
MemFree:          22344 kB
MemAvailable:    416584 kB
Buffers:              0 kB
SwapTotal:            0 kB
SwapFree:             0 kB
Cached:          394240 kB

2023/09/01 16:53:23 STEP(2): /tfs_mnt/TST_0007_a5672d94d89fe35f1eeed8767dbf531d32b8cc71 - a5672d94d89fe35f1eeed8767dbf531d32b8cc71 (134217728 B) - OK
2023/09/01 16:53:23 /proc/meminfo
MemTotal:        623680 kB
MemFree:          66504 kB
MemAvailable:    344640 kB
Buffers:              0 kB
SwapTotal:            0 kB
SwapFree:             0 kB
Cached:          278136 kB

2023/09/01 16:53:23 STEP(2): write /tfs_mnt/TST_0008_a5672d94d89fe35f1eeed8767dbf531d32b8cc71: no space left on device
2023/09/01 16:53:23 /proc/meminfo
MemTotal:        623680 kB
MemFree:          66496 kB
MemAvailable:    344636 kB
Buffers:              0 kB
SwapTotal:            0 kB
SwapFree:             0 kB
Cached:          278140 kB

2023/09/01 16:53:23 
2023/09/01 16:53:23 STEP(3): Read files and calculate checksum on (/tfs_mnt)
2023/09/01 16:53:23 
2023/09/01 16:53:23 
2023/09/01 16:53:23 STEP(3): /tfs_mnt/TST_0001_a5672d94d89fe35f1eeed8767dbf531d32b8cc71 - a5672d94d89fe35f1eeed8767dbf531d32b8cc71  (134217728 B) - OK
2023/09/01 16:53:23 /proc/meminfo
MemTotal:        623680 kB
MemFree:          65880 kB
MemAvailable:    256928 kB
Buffers:              0 kB
SwapTotal:            0 kB
SwapFree:             0 kB
Cached:          191048 kB


frame trace: 
ffffc00000a47b98:   ffffffff800461ee	(mutex_lock + 000000000000000e/000000000000003f)
ffffc00000a47ba8:   ffffffff80097983	(filesystem_storage_write + 0000000000000073/0000000000000132)
ffffc00000a47c08:   ffffffff8004ee81	(pagecache_commit_dirty_ranges + 0000000000000611/0000000000000eb0)
ffffc00000a47d98:   ffffffff8004a1f1	(pagecache_commit_dirty_node + 0000000000000181/0000000000000355)
ffffc00000a47e28:   ffffffff8005022d	(pagecache_sync_volume + 00000000000000fd/00000000000002c2)
ffffc00000a47ea8:   ffffffff800580e5	(storage_sync + 0000000000000045/00000000000000ef)
ffffc00000a47ee8:   ffffffff800b5459	(demand_anonymous_page + 00000000000000e9/0000000000000257)
ffffc00000a47f38:   ffffffff800ebb27	(unix_fault_handler + 00000000000001d7/0000000000001027)
ffffc00000a47f98:   ffffffff8011be5d	(common_handler + 000000000000038d/00000000000005c7)
ffffc00000a47fe8:   ffffffff800010ed
ffffc0000208ff20:   ffffffff80080834	(sg_copy_to_buf_and_release + 0000000000000114/0000000000000203)
ffffc0000208ff70:   ffffffff800db310	(file_read_complete + 00000000000000a0/00000000000003cd)
ffffc0000208ffc0:   ffffffff800414ef	(context_switch_finish + 000000000000006f/00000000000001b4)

loaded klibs: 
assertion !frame_is_full(ctx->frame) failed at /usr/local/src/nanos/src/kernel/mutex.c:109 (IP 0xffffffff800460b4)  in mutex_lock_internal(); halt
@rinor rinor changed the title issue(fs): assertion !frame_is_full(ctx->frame) failed at /usr/local/src/nanos/src/kernel/mutex.c:109 (IP 0xffffffff800460b4) issue(fs): assertion !frame_is_full(ctx->frame) failed at /nanos/src/kernel/mutex.c:109 (IP 0xffffffff800460b4) Sep 1, 2023
@francescolavra francescolavra self-assigned this Sep 5, 2023
francescolavra added a commit that referenced this issue Sep 5, 2023
A call to storage_sync() may suspend the current context, e.g. to
acquire the filesystem mutex. Therefore, this function cannot be
called directly from the code that handles a page fault exception;
instead, an asynchronous thunk should be used.
This commit changes the mmap_anon_page() closure function so that
it calls storage_sync() first, and the demand_anonymous_page()
function so that it invokes the above closure via async_apply_bh().

Closes #1933.
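For context, here is a minimal userspace sketch of the pattern the commit message describes. This is not the actual Nanos code: names such as sync_storage(), bh_worker() and fault_handler() are invented for illustration, and the real kernel uses its own closure and async_apply_bh() machinery rather than pthreads. What it models is only the ordering change: work that may block on the filesystem mutex (storage_sync()) is deferred to a bottom-half context instead of running inline in the page-fault path.

/*
 * Userspace analogy only -- NOT the Nanos kernel code.
 * It models deferring mutex-taking work out of a non-blockable
 * (fault-handler-like) context into a schedulable worker.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t fs_mutex = PTHREAD_MUTEX_INITIALIZER;

/* stand-in for storage_sync(): needs the fs mutex, so it may block */
static void sync_storage(void)
{
    pthread_mutex_lock(&fs_mutex);
    puts("syncing dirty page cache to storage");
    pthread_mutex_unlock(&fs_mutex);
}

/* trivial single-slot "bottom half" queue */
static pthread_mutex_t bh_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t bh_cond = PTHREAD_COND_INITIALIZER;
static bool bh_pending = false;

static void *bh_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&bh_lock);
    while (!bh_pending)
        pthread_cond_wait(&bh_cond, &bh_lock);
    bh_pending = false;
    pthread_mutex_unlock(&bh_lock);

    /* safe to block here: this is an ordinary schedulable context */
    sync_storage();
    puts("retrying anonymous page allocation after sync");
    return NULL;
}

/* the "fault handler": must not block, so it only schedules the work */
static void fault_handler(void)
{
    pthread_mutex_lock(&bh_lock);
    bh_pending = true;
    pthread_cond_signal(&bh_cond);
    pthread_mutex_unlock(&bh_lock);
}

int main(void)
{
    pthread_t worker;
    pthread_create(&worker, NULL, bh_worker, NULL);
    fault_handler();          /* analogous to demand_anonymous_page() */
    pthread_join(worker, NULL);
    return 0;
}

In the kernel itself this corresponds, per the commit message, to demand_anonymous_page() enqueueing the mmap_anon_page() closure via async_apply_bh(), with the closure calling storage_sync() before retrying the allocation.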
@francescolavra
Member

Sorry, I failed to re-test the changes in #1904 after rebasing. This issue is now fixed in #1934.


rinor commented Sep 6, 2023

Tested #1934 and can confirm it works as expected. Thank you.
