How to solve lib/enclave.cc:404(8388) CHECK FAILED: ctl_fd_ >= 0 [-1 > 0] problem ? #8

Closed
NGUETOUM opened this issue Jan 21, 2022 · 37 comments

Comments

@NGUETOUM

Hello,
Please excuse me for reopening this issue, but I hit a similar error when I ran the command bazel run agent_shinjuku as root. This is the error I get:
[screenshot]
I think it is because I don't have the right to write into the /sys/fs/ghost/ folder.
I don't know how to change the permissions of that folder: when I run sudo chmod -R 777 /sys/fs/ghost/ I just get "Operation not permitted".
Please, I need your help.

Originally posted by @NGUETOUM in #2 (comment)

@NGUETOUM
Author

I am using Ubuntu 20.04, and my ghost-kernel version is 5.11.0+.

@jackhumphries
Collaborator

jackhumphries commented Jan 21, 2022

It seems that the call to LocalEnclave::MakeNextEnclave() is returning -1 in LocalEnclave::MakeAndAttachToEnclave(). Can you determine which line in LocalEnclave::MakeNextEnclave() is failing and post here?

@NGUETOUM
Author

This line:
int top_ctl =
    open(absl::StrCat(Ghost::kGhostfsMount, "/ctl").c_str(), O_WRONLY);
[screenshot]
I tried to list the contents of the /sys/fs/ghost folder, but the folder is empty and I don't have permission to do anything in it.
[screenshot]
I think the folder is still empty because the program doesn't have permission to create a local enclave during execution.

@brho
Collaborator

brho commented Jan 21, 2022

looks like ghostfs isn't mounted for some reason.

what do you get from this:

mount -t ghost ghost /sys/fs/ghost
ls -l /sys/fs/ghost/
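
For anyone who wants to script this check rather than read the ls output by eye, here is a minimal sketch that scans mount-table text in the /proc/mounts format for a ghost filesystem at /sys/fs/ghost. IsMountedAt is a hypothetical helper written for this note, not code from the ghOSt library, and it assumes the usual "device mount_point type options ..." line layout.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the ghOSt library): returns true if
// `mounts`, text in /proc/mounts format, lists a filesystem of type
// `fstype` mounted at `path`.
bool IsMountedAt(const std::string& mounts, const std::string& fstype,
                 const std::string& path) {
  assert(!path.empty());  // an empty mount point can never match
  std::istringstream lines(mounts);
  std::string line;
  while (std::getline(lines, line)) {
    std::istringstream fields(line);
    std::string device, mount_point, type;
    // Each line starts with: device mount_point type
    if (fields >> device >> mount_point >> type) {
      if (mount_point == path && type == fstype) return true;
    }
  }
  return false;
}
```

Reading /proc/mounts into a string and calling IsMountedAt(contents, "ghost", "/sys/fs/ghost") would give the live answer on a running system.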

@NGUETOUM
Author

Thanks, @brho.
Your command solved my problem. This is the output:
[screenshot]
Now I am able to compile agent_shinjuku:
[screenshot]
Thanks for your help.

@NGUETOUM
Author

This is the output of the command:
[screenshot]

@brho
Collaborator

brho commented Jan 24, 2022 via email

@NGUETOUM
Author

Good morning @brho, and sorry for my silence.
When I mount ghostfs I am able to run the agent, but after I unmount ghostfs I get the same error again.
With ghostfs mounted, I ran the agent with this command: bazel run fifo_agent --run_under='strace -f -o /home/armel/Desktop/ghost/output_ghostfs_mount.txt'
The following file is the output:

output_ghostfs_mount.txt

With ghostfs unmounted, I ran the agent with this command: bazel run fifo_agent --run_under='strace -f -o /home/armel/Desktop/ghost/output_ghostfs_umount.txt'
The following file is the output:

output_ghostfs_umount.txt

I don't know why the agent is not able to mount ghostfs itself.

@jackhumphries
Collaborator

It looks like the strace output for the second case does not include a call to mount(). This call is in lib/ghost.cc. Are you able to manually debug the code a little bit more to see why this call is being skipped? For example, is something happening in GhostIsMountedAt() that is causing the agent process to think ghostfs is already mounted?
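
Comparing two strace logs for a missing call can be automated. The sketch here assumes the log was written by strace -f -o, with each line in the form "pid syscall(args) = result". TraceHasSyscall is a hypothetical helper for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `trace` (text written by
// `strace -f -o FILE`) contains a call to the syscall named `name`.
bool TraceHasSyscall(const std::string& trace, const std::string& name) {
  assert(!name.empty());
  std::istringstream lines(trace);
  std::string line;
  while (std::getline(lines, line)) {
    const std::size_t paren = line.find('(');
    if (paren == std::string::npos) continue;  // e.g. "+++ exited ..." lines
    // The text before '(' is "<pid> <syscall>"; keep what follows the
    // last space so that "umount2" is not mistaken for "mount".
    const std::string head = line.substr(0, paren);
    const std::size_t space = head.rfind(' ');
    const std::string syscall =
        space == std::string::npos ? head : head.substr(space + 1);
    if (syscall == name) return true;
  }
  return false;
}
```

Running it with the name "mount" on the contents of each of the two output files would show whether the agent attempted the mount in each case.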

@NGUETOUM
Author

NGUETOUM commented Feb 3, 2022

Hello @jackhumphries,
Please give me a few minutes to check it.

@NGUETOUM
Author

NGUETOUM commented Feb 3, 2022

Hi @jackhumphries,
Yes, I inspected the code and checked the GhostIsMountedAt() and MountGhostfs() functions. To work around the issue, I simply call Ghost::MountGhostfs(); manually at the beginning of int LocalEnclave::MakeNextEnclave(), as you can see in the following image:
[screenshot]
This solved my problem. Now I don't need to run mount -t ghost ghost /sys/fs/ghost manually to mount ghostfs before running the schedulers.

@NGUETOUM
Author

NGUETOUM commented Feb 3, 2022

Sorry to disturb you again, but during my work I hit another error.
When I run the Shinjuku scheduler with bazel run agent_shinjuku, the scheduler runs normally and reaches the "Initialization complete, ghOSt active." phase.
I then used the pushtosched.c tool to migrate the threads of a userspace application to ghOSt, but I got the following error:

[screenshot: agent_shinjuku_error]
In issues/1 you already mentioned this error. You assumed that, since Shinjuku requires the client app being scheduled to set up a shared memory region for communication (check out the RocksDB experiment), it crashes in my case because this region is not set up:
[screenshot]
Can you please tell me explicitly what I am supposed to do, based on the RocksDB experiment, to solve this problem?
I am using the word_count-pthread application from Phoenix MapReduce as the userspace application.

@NGUETOUM
Author

NGUETOUM commented Feb 3, 2022

I just want to add that thread migration works with the FIFO scheduler, using the same word_count-pthread application from Phoenix MapReduce.

@jackhumphries
Collaborator

jackhumphries commented Feb 3, 2022

For the mount issue, can you look at why GhostIsMountedAt() is returning true even though ghostfs is not mounted? Presumably that is why MountGhostfs() is not being called.

For Shinjuku, you are seeing the segfault because the application needs to set up a PrioTable before moving its threads to ghOSt. This is also necessary for the EDF scheduler. I would stick with the SOL scheduler and the two FIFO schedulers for anything else since they work with applications out of the box, i.e., they do not require a PrioTable.

@NGUETOUM
Author

NGUETOUM commented Feb 3, 2022

OK, I will check GhostIsMountedAt() now. But how am I supposed to set up a PrioTable for my application before migrating its threads?

@NGUETOUM
Author

NGUETOUM commented Feb 3, 2022

The error I get when I run the EDF scheduler is different. I would like to finish with the Shinjuku scheduler first, before we talk about the EDF scheduler.

@jackhumphries
Collaborator

You should check out the RocksDB and antagonist apps in experiments/. They use the library code in experiments/shared/ to set up a PrioTable and write thread information into it.

@NGUETOUM
Author

NGUETOUM commented Feb 3, 2022

OK. Please give me a few minutes to check GhostIsMountedAt() first.

@NGUETOUM
Author

NGUETOUM commented Feb 3, 2022

Hello @jackhumphries,
I am very sorry that this took several hours.
I ran some tests on GhostIsMountedAt() and I think the function works correctly: it returns false when ghostfs isn't mounted and true when it is. However, it is never checked in lib/enclave.cc.
Please tell me if I am wrong, but in my case I think the agent cannot mount ghostfs itself because lib/enclave.cc never first checks whether ghostfs is already mounted and mounts it if not.
For the future, I think we could add a check to int LocalEnclave::MakeNextEnclave().
I propose the following code at the beginning of int LocalEnclave::MakeNextEnclave():
if (!Ghost::GhostIsMountedAt(Ghost::kGhostfsMount)) {
  Ghost::MountGhostfs();
}

as you can see in the following picture:
[screenshot]

@NGUETOUM
Author

NGUETOUM commented Feb 3, 2022

On my computer this code works perfectly.

@NGUETOUM
Author

NGUETOUM commented Feb 3, 2022

And I don't need to mount ghostfs manually.

@jackhumphries
Collaborator

jackhumphries commented Feb 3, 2022

The ghostfs mount code is present in Ghost::GetVersion(). Do you see Ghost::GetVersion() being called in your code? It should be called by Ghost::CheckVersion(), which in turn is called on process startup since the return value of that function is used to initialize Agent::kVersionCheck, which is a static variable.

@NGUETOUM
Author

NGUETOUM commented Feb 3, 2022

But in my code I don't see a Ghost::GetSupportedVersions() function.

@NGUETOUM
Author

NGUETOUM commented Feb 3, 2022

The function that I have is Ghost::GetVersion(), but it is never called in lib/enclave.cc.

@NGUETOUM
Author

NGUETOUM commented Feb 3, 2022

int Ghost::GetVersion(uint64_t& version) calls bool Ghost::GhostIsMountedAt(const char* path), but int Ghost::GetVersion(uint64_t& version) is never called in lib/enclave.cc.

@jackhumphries
Collaborator

jackhumphries commented Feb 3, 2022

Sorry, made a typo. See my updated comment. As I said, Agent::kVersionCheck is a static variable initialized on startup to the return value of Ghost::CheckVersion(), which calls Ghost::GetVersion(), which does the mount.

If you add print statements in, do you not see anything printed out on agent startup?
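
The startup mechanism described here can be illustrated with a minimal sketch. Everything in namespace sketch is a hypothetical stand-in: the real Ghost::CheckVersion() and Agent::kVersionCheck live in lib/ghost.h and lib/agent.cc, and this version only counts calls instead of mounting ghostfs. It assumes a common toolchain where dynamic initializers of static variables run before main().

```cpp
#include <cassert>

namespace sketch {

// Counts how many times the startup check has run. A function-local
// static keeps the counter valid even during static initialization.
inline int& StartupCheckCalls() {
  static int calls = 0;
  return calls;
}

// Stand-in for Ghost::CheckVersion(). Per the thread, the real function
// leads to the ghostfs mount and the version comparison; here it only
// records that it was called.
inline bool CheckVersion() {
  ++StartupCheckCalls();
  return true;
}

struct Agent {
  static const bool kVersionCheck;
};

// Dynamic initialization of a static member: CheckVersion() is called
// during program startup, without any explicit call from agent code.
const bool Agent::kVersionCheck = CheckVersion();

// Fails fast if the startup check did not run, for example because the
// definition of kVersionCheck was commented out.
inline void RequireStartupCheck() { assert(StartupCheckCalls() == 1); }

}  // namespace sketch
```

Because nothing else calls CheckVersion(), removing the single line that defines kVersionCheck silently removes the whole check, and with it any side effect such as the mount.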

@NGUETOUM
Author

NGUETOUM commented Feb 4, 2022

Please excuse me, I have forgotten why I commented out that line of code in lib/agent.cc. But my problem is that I don't see the definition of Ghost::CheckVersion() in my code. lib/agent.cc just uses it, and I don't know where it is defined.

@NGUETOUM
Author

NGUETOUM commented Feb 4, 2022

When I uncomment that line and try to run the FIFO scheduler, I get the following error:
./lib/ghost.h:236(7438) CHECK FAILED: kernel_abi_version == 49 [45 != 49]
PID 7438 Backtrace:
[0] 0x55fa3fc5a8b7 : ghost::Exit()
[1] 0x55fa3fbfe72e : ghost::Ghost::CheckVersion()
[2] 0x55fa3fbfddd7 : __static_initialization_and_destruction_0()
[3] 0x55fa3fbfddfa : _GLOBAL__sub_I_agent.cc
[4] 0x55fa3fd0be3d : __libc_csu_init
Maybe I commented out that line of code to work around this problem.

@NGUETOUM
Author

NGUETOUM commented Feb 4, 2022

The version of the ghost kernel may be lower than the version of ghost userspace.

@NGUETOUM
Author

NGUETOUM commented Feb 4, 2022

How am I supposed to solve this issue now?

@jackhumphries
Collaborator

Ghost::CheckVersion() is defined in lib/ghost.h.

The userspace code and the ghOSt kernel code share the ghOSt UAPI header file: kernel/ghost_uapi.h (in this repo) and include/uapi/linux/ghost.h (in the kernel repo). We enforce that the version of each must match. If they do not match, then the header files are out of sync (e.g., type definitions may be missing or differ), and the kernel may behave differently from what userspace expects.

To fix this, you need to make sure the two versions match, generally by just pulling in the most recent commits in both repos and then compiling and installing the updated kernel. The most recent version number is 55 (see #define GHOST_VERSION 55 in the header).
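
As a rough illustration of the version rule, the following sketch classifies a kernel/userspace pair. AbiStatus and CompareAbi are hypothetical names invented for this note; the library's real check is the CHECK in lib/ghost.h shown in the backtrace earlier in the thread.

```cpp
#include <cassert>

// Outcome of comparing the kernel's ghOSt ABI version with the version
// the userspace code was compiled against.
enum class AbiStatus { kMatch, kKernelOlder, kKernelNewer };

// Hypothetical helper, not the library's CheckVersion().
AbiStatus CompareAbi(int kernel_version, int userspace_version) {
  assert(kernel_version > 0 && userspace_version > 0);
  if (kernel_version == userspace_version) return AbiStatus::kMatch;
  return kernel_version < userspace_version ? AbiStatus::kKernelOlder
                                            : AbiStatus::kKernelNewer;
}
```

With the numbers from the failed CHECK in this thread, CompareAbi(45, 49) reports kKernelOlder, which is consistent with the advice to pull both repos and then build and install the updated kernel.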

@NGUETOUM NGUETOUM closed this as completed Feb 4, 2022
@NGUETOUM
Author

NGUETOUM commented Feb 4, 2022

Please excuse me, that was a mistake. I did not mean to close the issue.

@NGUETOUM
Author

NGUETOUM commented Feb 4, 2022

Ok let me see.

@NGUETOUM
Author

NGUETOUM commented Feb 4, 2022

In my repo the kernel is at version 45.
Does that mean I am supposed to pull a newer version of the kernel, compile it, and install it to solve my issue?

@jackhumphries
Collaborator

Yes, and make sure you also pull the latest version of the code in the userspace repo, too. You'll notice that #define GHOST_VERSION has the same version number once you pull the latest code in both repositories.

@NGUETOUM
Author

NGUETOUM commented Feb 4, 2022

OK, I will try it now.

@NGUETOUM
Author

NGUETOUM commented Feb 4, 2022

Hi @jackhumphries,
I pulled the latest versions of ghost-kernel and ghost-userspace from GitHub, and now the agent is able to mount ghostfs itself. The problem came from me, because I had commented out the line that checks the ghost version. I am very sorry for my mistake, and thank you very much for your patience and your attention.
Now I want to look at the RocksDB and antagonist apps in experiments/ to see how to set up the PrioTable before migrating threads with the Shinjuku scheduler, as you said yesterday.
Thank you very much, Sir.

hannahyp pushed a commit that referenced this issue Mar 18, 2022
Specifically the bug is that we are depending on the initialization of
GHOST_TID_SEQNUM_BITS (in base.cc) during the initialization of another
global Agent::kVersionCheck (in agent.cc). Since there is no ordering
across compilation units this can result in accessing GHOST_TID_SEQNUM_BITS
before it has been initialized.

READ of size 4 at 0x7ff981751180 thread T0
    #0 0x7ff98134368f in ghost::gtid(long) third_party/ghost/lib/base.cc:122:19
    #1 0x7ff98134196d in GetGtid third_party/ghost/lib/base.cc:212:28
    #2 0x7ff98134196d in Current third_party/ghost/lib/base.h:164:40
    #3 0x7ff98134196d in ghost::Exit(int) third_party/ghost/lib/base.cc:285:28
    #4 0x7ff981b93db4 in ghost::Ghost::MountGhostfs() third_party/ghost/lib/ghost.cc:147:5
    #5 0x7ff981b93fe5 in ghost::Ghost::GetSupportedVersions(std::__u::vector<unsigned int, std::__u::allocator<unsigned int> >&) third_party/ghost/lib/ghost.cc:155:5
    #6 0x7ff982102744 in ghost::Ghost::CheckVersion() third_party/ghost/lib/ghost.h:274:5
    #7 0x7ff98217c564 in __cxx_global_var_init third_party/ghost/lib/agent.cc:102:35
    #8 0x7ff98217c564 in _GLOBAL__sub_I_agent.cc third_party/ghost/lib/agent.cc
    #9 0x7ff9cc37b4cc in call_init (/usr/grte/v5/lib64/ld-linux-x86-64.so.2+0x1c4cc)
    #10 0x7ff9cc37b329 in _dl_init (/usr/grte/v5/lib64/ld-linux-x86-64.so.2+0x1c329)

Fix this by caching the result in a function-local static variable in
get_tid_seqnum_bits() which is guaranteed to be initialized the first time
it is accessed, and this initialization will happen before any other access.

TESTED=all unit tests pass in virtme
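
The fix described in this commit message follows a common C++ pattern for avoiding initialization-order problems between globals in different files. The sketch here is an assumption-laden illustration, not the committed code: the names in namespace sketch are hypothetical, and the value 22 is made up.

```cpp
#include <cassert>

namespace sketch {

// Stand-in for however the real value is derived. The number 22 is made
// up for this example.
inline int ComputeSeqnumBits() { return 22; }

// The function-local static is initialized the first time control passes
// through its declaration, so every caller sees the computed value, even
// a caller that runs during another file's static initialization.
inline int GetTidSeqnumBits() {
  static const int bits = ComputeSeqnumBits();
  assert(bits > 0 && bits < 31);  // sanity check for the shift in kSeqnumLimit
  return bits;
}

// A global whose initializer needs the value. If it instead read a plain
// dynamically initialized global defined in a different .cc file, it
// could observe 0.
const int kSeqnumLimit = 1 << GetTidSeqnumBits();

}  // namespace sketch
```

The design trade-off is a small check on each call in exchange for a value that is always initialized before use, regardless of the order in which translation units are initialized.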