
Files unexpectedly released leading to Bad file descriptor #749

Closed
jchorl opened this issue Feb 13, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@jchorl
Contributor

jchorl commented Feb 13, 2024

Mountpoint for Amazon S3 version

1.4.0

AWS Region

ca-central-1

Describe the running environment

EC2, Amazon Linux 2. Mountpoint is running inside a privileged Ubuntu Docker container.

Mountpoint options

MOUNTPOINT_LOG=trace mount-s3 my-bucket mnt --prefix my-prefix/ --foreground -l mplogs

What happened?

Summary:

  • Calling a bash script that calls itself, served from mountpoint mount
  • It invokes a heavy disk/mem operation, unrelated to mountpoint
  • When execution returns to the script, it can no longer be read, and bash fails with Bad file descriptor errors

The setup is a bit complicated. I'll put the files at the end because some are a bit large.

  1. A bash script at least a few pages in length. It's important that the whole script not fit into a single page[0].
    • This script should go in my-bucket/my-prefix/
  2. An application that hammers disk + ram[1]. I wrote a simple C binary that reads a large file into memory.
  3. A large file. I used head -c 12G </dev/urandom > ~/big-file to generate.

It's important to note I was running on an r6i.large with a 250 GB gp3 volume (3000 IOPS, 125 MB/s throughput).

Here's how I ran it:

In one terminal, get mountpoint running:

docker run -it --rm --privileged ubuntu:23.10 bash
apt-get update; apt-get install -y gcc wget vim

# install mountpoint
wget https://s3.amazonaws.com/mountpoint-s3-release/latest/x86_64/mount-s3.deb
apt-get install -y ./mount-s3.deb

# prepare bin to stress disk/ram
cd
mkdir joshtest
vim main.c # contents later in report
gcc -o josh-mem-test main.c

# prep a big file to thrash disk/ram
head -c 12G </dev/urandom > ~/big-file

# copy script to bucket/prefix, contents later in report
aws s3 cp testscript.sh s3://my-bucket/my-prefix/

# get mountpoint running
mount-s3 my-bucket mnt --prefix my-prefix/ --foreground

In a separate terminal, exec into the container and run the script:

docker exec -it 040b2bcbb33c bash
bash ./mnt/testscript.sh f1
# outputs
# in f1
# in f2
# allocating a bunch of ram
# Tue Feb 13 06:26:43 GMT 2024
# Buffer size:
# 8
# done allocating a bunch of ram
# Tue Feb 13 06:28:26 GMT 2024
# done f2
# mnt/testscript.sh: error reading input file: Bad file descriptor
# done f2, back in f1
# ./mnt/testscript.sh: error reading input file: Bad file descriptor

Now for some observations:

  • This is a regression: using git bisect, I chased the breakage down to 0030b0a
  • I decreased the size of the file being read into RAM, and the behaviour was not exhibited. I enabled trace logging; this log line, emitted after the disk-thrash has run, is quite telling:
    2024-02-13T04:37:25.247447Z DEBUG fuser::request: FUSE( 27) ino 0x0000000000000002 READ fh FileHandle(2), offset 4096, size 4096    
    

Here is my hypothesis as to what's going on:

  • Prior to the breaking commit, there was a handle to the script file
  • Reading from it worked just fine, even after disk/ram had been thrashed
  • That commit introduces a Closed status for the file-handle, after which reading no longer works
  • The bash script forking likely causes a FLUSH, closing the file, even though the bash script will be returned to. I'm sure the nested bash call doesn't make things easier!
  • I'm guessing disk caching is also at play: with a smaller disk-thrash file, we never actually see another FUSE READ for the script file, so my bet is that the OS (or something else) has it cached. The combination of evicting it from the cache and re-reading through a closed file handle causes bash to fail.
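The POSIX behavior this hypothesis relies on can be illustrated on a regular (non-FUSE) filesystem: a duplicated file descriptor must remain readable after its sibling is closed, which is exactly what bash depends on across the nested invocation. A minimal sketch (scratch file and offsets are arbitrary, for illustration only):

```python
import os
import tempfile

# Create a scratch file on a regular filesystem.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 8192)
    path = f.name

# Open it and duplicate the descriptor, roughly mimicking bash's fork.
fd = os.open(path, os.O_RDONLY)
fd2 = os.dup(fd)

# Closing the original descriptor is what triggers a FUSE FLUSH on a
# Mountpoint mount...
os.close(fd)

# ...but on a POSIX filesystem the duplicate must stay readable.
os.lseek(fd2, 4096, os.SEEK_SET)
data = os.read(fd2, 10)
print(len(data))  # 10

os.close(fd2)
os.unlink(path)
```

On a local filesystem this prints 10; the bug reported here is that on a Mountpoint mount the final read fails with EBADF instead.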

I'm not sure if this is a minimal reproduction, but I worked hard to get it into a runnable state, so hopefully you can reproduce it too and provide guidance. You may be able to reproduce this behaviour using O_DIRECT or something simpler.

Scripts used for testing/reproduction:
[0]:

#!/bin/bash

ENTRYPOINT=$1


f1() {
    echo "in f1"
    # echo 1
    # echo 2
    # echo 3
    # echo 4
    # echo 5
    # echo 6
    # echo 7
    # echo 8
    # echo 9
    # echo 10
    # echo 11
    # echo 12
    # echo 13
    # echo 14
    # echo 15
    # echo 16
    # echo 17
    # echo 18
    # echo 19
    # echo 20
    # echo 21
    # echo 22
    # echo 23
    # echo 24
    # echo 25
    # echo 26
    # echo 27
    # echo 28
    # echo 29
    # echo 30
    # echo 31
    # echo 32
    # echo 33
    # echo 34
    # echo 35
    # echo 36
    # echo 37
    # echo 38
    # echo 39
    # echo 40
    # echo 41
    # echo 42
    # echo 43
    # echo 44
    # echo 45
    # echo 46
    # echo 47
    # echo 48
    # echo 49
    # echo 50
    # echo 51
    # echo 52
    # echo 53
    # echo 54
    # echo 55
    # echo 56
    # echo 57
    # echo 58
    # echo 59
    # echo 60
    # echo 61
    # echo 62
    # echo 63
    # echo 64
    # echo 65
    # echo 66
    # echo 67
    # echo 68
    # echo 69
    # echo 70
    # echo 71
    # echo 72
    # echo 73
    # echo 74
    # echo 75
    # echo 76
    # echo 77
    # echo 78
    # echo 79
    # echo 80
    # echo 81
    # echo 82
    # echo 83
    # echo 84
    # echo 85
    # echo 86
    # echo 87
    # echo 88
    # echo 89
    # echo 90
    # echo 91
    # echo 92
    # echo 93
    # echo 94
    # echo 95
    # echo 96
    # echo 97
    # echo 98
    # echo 99
    # echo 100
    /bin/bash mnt/testscript.sh f2
    echo "done f2, back in f1"
    # echo 101
    # echo 102
    # echo 103
    # echo 104
    # echo 105
    # echo 106
    # echo 107
    # echo 108
    # echo 109
    # echo 110
    # echo 111
    # echo 112
    # echo 113
    # echo 114
    # echo 115
    # echo 116
    # echo 117
    # echo 118
    # echo 119
    # echo 120
    # echo 121
    # echo 122
    # echo 123
    # echo 124
    # echo 125
    # echo 126
    # echo 127
    # echo 128
    # echo 129
    # echo 130
    # echo 131
    # echo 132
    # echo 133
    # echo 134
    # echo 135
    # echo 136
    # echo 137
    # echo 138
    # echo 139
    # echo 140
    # echo 141
    # echo 142
    # echo 143
    # echo 144
    # echo 145
    # echo 146
    # echo 147
    # echo 148
    # echo 149
    # echo 150
    # echo 151
    # echo 152
    # echo 153
    # echo 154
    # echo 155
    # echo 156
    # echo 157
    # echo 158
    # echo 159
    # echo 160
    # echo 161
    # echo 162
    # echo 163
    # echo 164
    # echo 165
    # echo 166
    # echo 167
    # echo 168
    # echo 169
    # echo 170
    # echo 171
    # echo 172
    # echo 173
    # echo 174
    # echo 175
    # echo 176
    # echo 177
    # echo 178
    # echo 179
    # echo 180
    # echo 181
    # echo 182
    # echo 183
    # echo 184
    # echo 185
    # echo 186
    # echo 187
    # echo 188
    # echo 189
    # echo 190
    # echo 191
    # echo 192
    # echo 193
    # echo 194
    # echo 195
    # echo 196
    # echo 197
    # echo 198
    # echo 199
    # echo 200
}

f2() {
    echo "in f2"
    # echo 201
    # echo 202
    # echo 203
    # echo 204
    # echo 205
    # echo 206
    # echo 207
    # echo 208
    # echo 209
    # echo 210
    # echo 211
    # echo 212
    # echo 213
    # echo 214
    # echo 215
    # echo 216
    # echo 217
    # echo 218
    # echo 219
    # echo 220
    # echo 221
    # echo 222
    # echo 223
    # echo 224
    # echo 225
    # echo 226
    # echo 227
    # echo 228
    # echo 229
    # echo 230
    # echo 231
    # echo 232
    # echo 233
    # echo 234
    # echo 235
    # echo 236
    # echo 237
    # echo 238
    # echo 239
    # echo 240
    # echo 241
    # echo 242
    # echo 243
    # echo 244
    # echo 245
    # echo 246
    # echo 247
    # echo 248
    # echo 249
    # echo 250
    # echo 251
    # echo 252
    # echo 253
    # echo 254
    # echo 255
    # echo 256
    # echo 257
    # echo 258
    # echo 259
    # echo 260
    # echo 261
    # echo 262
    # echo 263
    # echo 264
    # echo 265
    # echo 266
    # echo 267
    # echo 268
    # echo 269
    # echo 270
    # echo 271
    # echo 272
    # echo 273
    # echo 274
    # echo 275
    # echo 276
    # echo 277
    # echo 278
    # echo 279
    # echo 280
    # echo 281
    # echo 282
    # echo 283
    # echo 284
    # echo 285
    # echo 286
    # echo 287
    # echo 288
    # echo 289
    # echo 290
    # echo 291
    # echo 292
    # echo 293
    # echo 294
    # echo 295
    # echo 296
    # echo 297
    # echo 298
    # echo 299
    # echo 300
    echo "allocating a bunch of ram"
    echo "$(date)"
    /root/joshtest/josh-mem-test /root/big-file
    echo "done allocating a bunch of ram"
    echo "$(date)"
    # echo 301
    # echo 302
    # echo 303
    # echo 304
    # echo 305
    # echo 306
    # echo 307
    # echo 308
    # echo 309
    # echo 310
    # echo 311
    # echo 312
    # echo 313
    # echo 314
    # echo 315
    # echo 316
    # echo 317
    # echo 318
    # echo 319
    # echo 320
    # echo 321
    # echo 322
    # echo 323
    # echo 324
    # echo 325
    # echo 326
    # echo 327
    # echo 328
    # echo 329
    # echo 330
    # echo 331
    # echo 332
    # echo 333
    # echo 334
    # echo 335
    # echo 336
    # echo 337
    # echo 338
    # echo 339
    # echo 340
    # echo 341
    # echo 342
    # echo 343
    # echo 344
    # echo 345
    # echo 346
    # echo 347
    # echo 348
    # echo 349
    # echo 350
    # echo 351
    # echo 352
    # echo 353
    # echo 354
    # echo 355
    # echo 356
    # echo 357
    # echo 358
    # echo 359
    # echo 360
    # echo 361
    # echo 362
    # echo 363
    # echo 364
    # echo 365
    # echo 366
    # echo 367
    # echo 368
    # echo 369
    # echo 370
    # echo 371
    # echo 372
    # echo 373
    # echo 374
    # echo 375
    # echo 376
    # echo 377
    # echo 378
    # echo 379
    # echo 380
    # echo 381
    # echo 382
    # echo 383
    # echo 384
    # echo 385
    # echo 386
    # echo 387
    # echo 388
    # echo 389
    # echo 390
    # echo 391
    # echo 392
    # echo 393
    # echo 394
    # echo 395
    # echo 396
    # echo 397
    # echo 398
    # echo 399
    # echo 400
    echo "done f2"
}

$ENTRYPOINT

[1]:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {
    FILE *file;
    char *buffer;
    long file_size;

    // Open the file in binary read mode
    file = fopen(argv[1], "rb");
    if (file == NULL) {
        perror("Error opening file");
        return 1;
    }

    // Determine the file size
    fseek(file, 0, SEEK_END);
    file_size = ftell(file);
    rewind(file);

    // Allocate memory to store the file contents
    buffer = (char *)malloc(file_size * sizeof(char));
    if (buffer == NULL) {
        perror("Error allocating memory");
        fclose(file);
        return 1;
    }

    // Read the file contents into the buffer
    if (fread(buffer, sizeof(char), file_size, file) != file_size) {
        perror("Error reading file");
        fclose(file);
        free(buffer);
        return 1;
    }

    // Close the file
    fclose(file);

    // Report the buffer size. Note: sizeof(buffer) is the size of the
    // pointer (8 on x86_64), not of the allocation -- hence the "8" in
    // the script output above. Harmless for this repro.
    printf("Buffer size:\n%lu\n", sizeof(buffer));

    // Free the allocated memory
    free(buffer);

    return 0;
}

Relevant log output

No response

@jchorl jchorl added the bug Something isn't working label Feb 13, 2024
@dannycjones
Contributor

Thanks for the details on this, @jchorl. We'll look into this soon.

Quickly linking to another comment where we saw a similar symptom in another issue: #706 (comment)

@SamStudio8

Thanks @dannycjones, just a +1 in that we are also observing this. I was unable to reproduce it succinctly, so kudos to @jchorl. The files in question were also bash scripts, and our workload is also very heavy. I worked around the issue in our case by copying the files from S3 (via Mountpoint) to EBS so that I wouldn't have to read them from Mountpoint during execution, but this remains an open issue for us.

@dannycjones
Contributor

Thanks again for the details on this report.
We're working on a fix for this and hope to share more details soon.


The regression was introduced in 0030b0a and it is indeed the flush triggered when closing one of the file descriptors that causes the issue.

We have two simple reproductions (thanks @jamesbornholt and @passaro):

One in Python...

import os
import sys

# open the file and duplicate the file descriptor
path = sys.argv[1]
fd = os.open(path, os.O_RDONLY)
fd2 = os.dup(fd)

# read from the first file descriptor
b = os.read(fd, 10)
print(f"read from fd: {b}")

# close the first file descriptor -- triggers a fuse FLUSH
os.close(fd)

# now try reading from the second file descriptor
os.lseek(fd2, 100000, os.SEEK_SET)
b = os.read(fd2, 10)
print(f"read from fd2: {b}")
os.close(fd2)

And the other in Rust as a new integration test to be added: https://github.com/passaro/mountpoint-s3/blob/17c245dbd527815be6659232af97765c7cee07ea/mountpoint-s3/tests/fuse_tests/read_test.rs#L313-L335

@dannycjones
Contributor

dannycjones commented Feb 16, 2024

We've released v1.4.1 which contains a fix for this: https://github.com/awslabs/mountpoint-s3/releases/tag/mountpoint-s3-1.4.1

Please let us know @jchorl if you're still seeing the issue, otherwise I hope we can resolve it.

@jchorl
Contributor Author

jchorl commented Feb 16, 2024

Thank you for the speedy reproduction and fix.

I just ran a test that was repeatedly failing and it passed, so I think we're all good here. Thanks again!
