Removing files doesn't work #7

Open
zvikfir opened this issue Jun 10, 2021 · 1 comment

Comments

@zvikfir
Contributor

zvikfir commented Jun 10, 2021

Hi,

I have come across an issue where files can't be deleted from the /mlfs directory.

Here is an example of performing ls, followed by rm, and then ls again.

# ~kfirzv/bin/run_with_assise.sh ls -l /mlfs/

dev-dax engine is initialized: dev_path /dev/dax5.0 size 512000 MB
fetching node's IP address..
Process pid is 34257
ip address on interface 'lo' is 127.0.0.1
cluster settings:
--- node 0 - ip:127.0.0.1
Connecting to KernFS instance 0 [ip: 127.0.0.1]
[Local-Client] Creating connection (pid:34257, app_type:0, status:pending) to 127.0.0.1:12345 on sockfd 0
[Local-Client] Creating connection (pid:34257, app_type:1, status:pending) to 127.0.0.1:12345 on sockfd 1
[Local-Client] Creating connection (pid:34257, app_type:2, status:pending) to 127.0.0.1:12345 on sockfd 2
In thread
In thread
In thread
SEND --> MSG_INIT [pid 2|34257]
RECV <-- MSG_SHM [paths: /shm_recv_0|/shm_send_0]
[add_peer_socket():97] Established connection with 127.0.0.1 on sock:2 of type:2 and peer:0x7bf2a0
start shmem_poll_loop for sockfd 2
SEND --> MSG_INIT [pid 1|34257]
SEND --> MSG_INIT [pid 0|34257]
RECV <-- MSG_SHM [paths: /shm_recv_1|/shm_send_1]
[add_peer_socket():97] Established connection with 127.0.0.1 on sock:1 of type:1 and peer:0x7bf2a0
start shmem_poll_loop for sockfd 1
RECV <-- MSG_SHM [paths: /shm_recv_2|/shm_send_2]
[add_peer_socket():97] Established connection with 127.0.0.1 on sock:0 of type:0 and peer:0x7bf2a0
start shmem_poll_loop for sockfd 0
[signal_callback():1370] Assigned LibFS ID=1
MLFS cluster initialized
init log dev 1 start_blk 125564929 end 125827072
total 9216
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_0
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_1
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_2
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_3
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_4
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_5
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_6
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_7
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_8


# -----------------------------


# ~kfirzv/bin/run_with_assise.sh rm -rf /mlfs/mpi_hello_*

dev-dax engine is initialized: dev_path /dev/dax5.0 size 512000 MB
fetching node's IP address..
Process pid is 34287
ip address on interface 'lo' is 127.0.0.1
cluster settings:
--- node 0 - ip:127.0.0.1
Connecting to KernFS instance 0 [ip: 127.0.0.1]
[Local-Client] Creating connection (pid:34287, app_type:0, status:pending) to 127.0.0.1:12345 on sockfd 0
[Local-Client] Creating connection (pid:34287, app_type:1, status:pending) to 127.0.0.1:12345 on sockfd 1
In thread
[Local-Client] Creating connection (pid:34287, app_type:2, status:pending) to 127.0.0.1:12345 on sockfd 2
In thread
In thread
SEND --> MSG_INIT [pid 1|34287]
SEND --> MSG_INIT [pid 0|34287]
RECV <-- MSG_SHM [paths: /shm_recv_0|/shm_send_0]
RECV <-- MSG_SHM [paths: /shm_recv_1|/shm_send_1]
[add_peer_socket():97] Established connection with 127.0.0.1 on sock:1 of type:1 and peer:0x21ac2a0
start shmem_poll_loop for sockfd 1
[add_peer_socket():97] Established connection with 127.0.0.1 on sock:0 of type:0 and peer:0x21ac2a0
start shmem_poll_loop for sockfd 0
SEND --> MSG_INIT [pid 2|34287]
RECV <-- MSG_SHM [paths: /shm_recv_2|/shm_send_2]
[add_peer_socket():97] Established connection with 127.0.0.1 on sock:2 of type:2 and peer:0x21ac2a0
start shmem_poll_loop for sockfd 2
[signal_callback():1370] Assigned LibFS ID=1
MLFS cluster initialized
init log dev 1 start_blk 125564929 end 125827072

# --------------------------

# ~kfirzv/bin/run_with_assise.sh ls -l /mlfs/

dev-dax engine is initialized: dev_path /dev/dax5.0 size 512000 MB
fetching node's IP address..
Process pid is 34306
ip address on interface 'lo' is 127.0.0.1
cluster settings:
--- node 0 - ip:127.0.0.1
Connecting to KernFS instance 0 [ip: 127.0.0.1]
[Local-Client] Creating connection (pid:34306, app_type:0, status:pending) to 127.0.0.1:12345 on sockfd 0
[Local-Client] Creating connection (pid:34306, app_type:1, status:pending) to 127.0.0.1:12345 on sockfd 1
[Local-Client] Creating connection (pid:34306, app_type:2, status:pending) to 127.0.0.1:12345 on sockfd 2
In thread
In thread
In thread
SEND --> MSG_INIT [pid 2|34306]
SEND --> MSG_INIT [pid 0|34306]
SEND --> MSG_INIT [pid 1|34306]
RECV <-- MSG_SHM [paths: /shm_recv_1|/shm_send_1]
RECV <-- MSG_SHM [paths: /shm_recv_0|/shm_send_0]
RECV <-- MSG_SHM [paths: /shm_recv_2|/shm_send_2]
[add_peer_socket():97] Established connection with 127.0.0.1 on sock:0 of type:0 and peer:0xac22a0
start shmem_poll_loop for sockfd 0
[add_peer_socket():97] Established connection with 127.0.0.1 on sock:2 of type:2 and peer:0xac22a0
start shmem_poll_loop for sockfd 2
[add_peer_socket():97] Established connection with 127.0.0.1 on sock:1 of type:1 and peer:0xac22a0
start shmem_poll_loop for sockfd 1
[signal_callback():1370] Assigned LibFS ID=1
MLFS cluster initialized
init log dev 1 start_blk 125564929 end 125827072
total 9216
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_0
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_1
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_2
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_3
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_4
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_5
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_6
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_7
---------- 1 root root 1048576 Jan  1  1970 mpi_hello_world_8

In addition, some files seem to survive even after switching the NVRAM between app-direct and memory mode, re-creating the namespace, and re-running mkfs. For instance, running rm -rf /mlfs/* reports that some of these files can't be deleted, even though they shouldn't exist anymore.

Is there any way to clear Assise's cache?

Thanks,
Kfir

@wreda
Contributor

wreda commented Jun 24, 2021

The rm command uses syscalls that Assise doesn't support, which is why it isn't doing anything. You may want to write a custom program that lists the desired directory (see libfs/tests/statdir_test.c) and then calls unlink on each file you want to remove.

The mkfs.sh script should wipe all filesystem data and metadata. If it isn't working as expected, I'd double-check that your binaries are up to date by cleaning and recompiling LibFS.
