Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IDLE doesn't call os.fsync() #80988

Closed
gvanrossum opened this issue May 5, 2019 · 19 comments
Closed

IDLE doesn't call os.fsync() #80988

gvanrossum opened this issue May 5, 2019 · 19 comments
Assignees
Labels
3.7 (EOL) end of life 3.8 (EOL) end of life topic-IDLE type-bug An unexpected behavior, bug, or error

Comments

@gvanrossum
Copy link
Member

BPO 36807
Nosy @gvanrossum, @terryjreedy, @dhalbert, @serhiy-storchaka
PRs
  • bpo-36807: When saving a file in IDLE, call flush and fsync #13102
  • [3.7] bpo-36807: When saving a file in IDLE, call flush and fsync (GH-13102) #13280
  • bpo-36807: Fix typo in NEWS item about IDLE (os.flush() should be os.fsync()) #13284
  • [3.7] bpo-36807: When saving a file in IDLE, call flush and fsync (GH-13102) #13288
  • [2.7] bpo-36807: When saving a file in IDLE, call flush and fsync (GH… #13293
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/terryjreedy'
    closed_at = <Date 2019-05-13.23:47:53.669>
    created_at = <Date 2019-05-05.20:22:26.300>
    labels = ['3.8', 'expert-IDLE', 'type-bug', '3.7']
    title = "IDLE doesn't call os.fsync()"
    updated_at = <Date 2019-05-13.23:52:32.100>
    user = 'https://github.com/gvanrossum'

    bugs.python.org fields:

    activity = <Date 2019-05-13.23:52:32.100>
    actor = 'gvanrossum'
    assignee = 'terry.reedy'
    closed = True
    closed_date = <Date 2019-05-13.23:47:53.669>
    closer = 'terry.reedy'
    components = ['IDLE']
    creation = <Date 2019-05-05.20:22:26.300>
    creator = 'gvanrossum'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 36807
    keywords = ['patch']
    message_count = 19.0
    messages = ['341474', '341480', '341486', '341487', '341490', '341491', '341494', '341680', '341850', '341851', '342274', '342275', '342313', '342353', '342360', '342366', '342397', '342400', '342401']
    nosy_count = 4.0
    nosy_names = ['gvanrossum', 'terry.reedy', 'dhalbert', 'serhiy.storchaka']
    pr_nums = ['13102', '13280', '13284', '13288', '13293']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue36807'
    versions = ['Python 2.7', 'Python 3.7', 'Python 3.8']

    @gvanrossum
    Copy link
    Member Author

    This came up during today's final PyCon keynote -- IDLE was called out
    as one of the two editors *not* to use when live-coding on Adafruit's
    Circuit Playground Express (https://www.adafruit.com/product/3333).

    I *think* that the problem referred to is that IDLE doesn't guarantee
    that the bits are actually flushed to disk -- they may linger in the
    OS filesystem cache.

    @gvanrossum gvanrossum added 3.7 (EOL) end of life 3.8 (EOL) end of life labels May 5, 2019
    @gvanrossum gvanrossum added topic-IDLE type-bug An unexpected behavior, bug, or error labels May 5, 2019
    @gvanrossum
    Copy link
    Member Author

    If/when you accept this we should also backport it as far as we can.

    @terryjreedy
    Copy link
    Member

    This seems like a good idea. I am assuming that open('somename', ...) always result in a file object with .fileno defined.

    Although I am mostly not touching 2.7, this should be easy to do. It will have to be done manually since iomenu was then IOBinding.

    @terryjreedy
    Copy link
    Member

    OS and disk interaction is not something I know a lot about. I don't know how long different OSes take to write things out by themselves, and therefore how much real danger there is of loosing data. However, IDLE is used in places where power is less reliable than it is for me, and if not doing this makes IDLE look bad, and if it does not noticeably delay running a file (and I expect not), then it seems like a good idea.

    Digging further, Kurt Kaiser added f.flush in 3/19/2006. On 2013-08-04, for bpo-18151, Serhiy submitted a patch for 3.3 with the comment "Here is a patch which replace the "open ... close" idiom to the "with open" idiom in IDLE." It replaced
    f = open(filename, "wb")
    f.write(chars)
    f.flush()
    f.close()
    with
    with open(filename, "wb") as f:
    f.write(chars)

    I have no idea why f.flush was deleted. An oversight? There is no discussion on the issue. I merged Serhiy's patch and backported to 2.7.

    @terryjreedy
    Copy link
    Member

    The io doc says for IOBase flush()
    Flush the write buffers of the stream if applicable. This does nothing for read-only and non-blocking streams.

    and for BufferedWriter flush()
    Force bytes held in the buffer into the raw stream. A BlockingIOError should be raised if the raw stream blocks.

    On 3.x, open(filename, "wb"), used in writefile(), returns a BufferedWriter. So it seems than an exception is possible, which would crash IDLE without try-except.

    Serhiy, please read the previous message(s). Do you remember if you intended to remove the f.flush in writefile(), which Guido proposes to restore?

    @serhiy-storchaka
    Copy link
    Member

    f.flush() was removed because it is redundant if followed by f.close(). f.close() calls f.flush() internally. f.close() is now called implicitly in __exit__() when exit from the with statement.

    Guido restores f.flush() because he needs to flush the buffer of the buffered stream before calling os.fsync(f.fileno()). And f.fileno() can be used only for not closed file. So this change looks reasonable to me.

    I do not know what are the troubles with Adafruit's
    Circuit Playground Express. I think that flushing the OS filesystem cache is the work of the OS itself, and lingering the data in the OS-wide cache should not affect userspace. But maybe there are specific circumstences, for example if IDLE and the consumer of Python source files are ran on different machines.

    @gvanrossum
    Copy link
    Member Author

    That board implements a USB filesystem that you plug into a computer (via a cable). The control software on the board watches this USB filesystem, and when the "code.py" file changes it reloads that file and executes it with CircuitPython (Adafruit's fork of MicroPython).

    So the recommended workflow for the user is: edit the code to program a LED blinking pattern (for example); save the file; watch the LEDs on the board blink in the programmed pattern. Repeat with other changes. Endless fun.

    Where IDLE currently doesn't cooperate is that after a save operation, the OS kernel doesn't immediately flush the bytes to the USB filesystem, so the board doesn't see the changes to the file until the OS cache gets flushed at the kernel's whim. (The board uses very low level fs operations because it's an 8-bit microprocessor with very little memory and therefore has very primitive software.)

    The os.fsync() call ought to fix it by forcing the kernel to flush the bytes to the USB filesystem right away. This also helps when the user saves the file and immediately pulls out the USB cable. The OS will issue a warning in an attempt to train users to "Eject" first, but beginners are prone to making this mistake repeatedly.

    Hope this helps. You should really watch @nnja's keynote once it goes online, it was really cool. :-)

    @dhalbert
    Copy link
    Mannequin

    dhalbert mannequin commented May 7, 2019

    I'm one of the CircuitPython core devs. This issue is OS-dependent: Windows and Linux don't necessarily write data and metadata out to USB drives promptly. The problem is particularly acute for FAT12 filesystems on Windows, which are typically 16MB or smaller: Windows can take up to 90 seconds to flush, due to a driver bug or perhaps some ancient concerns about floppy drives. MacOS doesn't have this issue.

    See https://superuser.com/questions/1197897/windows-delays-writing-fat-table-on-small-usb-drive-despite-quick-removal/1203781 for details. We have contacted Microsoft about this but getting it fixed is a long-term and uncertain process.

    I'll test on Windows when I return from PyCon. Thank you for the prompt fix!

    @terryjreedy
    Copy link
    Member

    I watched the talk by Nina Zakharenko at https://www.youtube.com/watch?v=35mXD40SvXM. Since I love being able to save, compile, and run a file with one keystroke, I appreciate people wanting to do the equivalent with CircuitExpress and Adafruit. In IDLE, print() output would appear in IDLE's Shell unless sys.stdout were replaced.

    Dan's note suggests that flush/fsynch would be more generally useful than for the board. I am inclined to add try: except: around the calls, and perhaps display a message "File was not saved" + traceback.

    Dan, slightly OT, but I am curious whether one can access USB ports (in a system-dependent manner) directly from python code via os.system and ctypes?

    @dhalbert
    Copy link
    Mannequin

    dhalbert mannequin commented May 8, 2019

    Dan, slightly OT, but I am curious whether one can access USB ports (in a system-dependent manner) directly from python code via os.system and ctypes?

    Do you mean from CircuitPython? The USB impplementation provides HID keyboard, mouse, and gamepad devices, CDC (serial), MIDI, and MSC (CIRCUITPY). We don't provide ctypes (though MicroPython does), but there are native modules that provide access to these devices. We will are planning to add user-defined USB descriptors.

    Also, we provide some enhancements in a CircuitPython-specific module to peek for input on CDC, so you can know when to call input(), without using the usual OS-dependent Python tricks for this.

    Right now REPL input/output is mixed with user input, as you mention, but we're considering adding a second serial CDC port so you'll be able to have a second serial channel and avoid colliding with the REPL.

    I'd be happy to discuss this kind of stuff with you further, by email or whatever other mechanism you have in mind, like some appropriate python mailing list or our own chat servers, etc.

    @dhalbert
    Copy link
    Mannequin

    dhalbert mannequin commented May 13, 2019

    Fix tested and works! Comment from PR duplicated here.

    I tested this fix by editing the 3.7.3 IDLE code by hand, and editing this test program as code.py on a CIRCUITPY drive on Windows 10:

    import time
    i = 0
    while True:
        print(i)
        i += 1
        print("""\
    123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
    123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
    """)
        time.sleep(1)

    I typically test write flushing by editing a short program like this, and duplicating the print lines so that the size of the program grows by >512 bytes. This forces a rewrite of the FAT12 information because the program has increased in size by one or more filesystem blocks. Then I shrink it back down again. If file flushing doesn't happen immediately, the program will often throw a syntax error (because it's missing some blocks), or CircuitPython will get an I/O error, which is what happened before I patched IDLE.

    So this looks good!

    @gvanrossum
    Copy link
    Member Author

    @terry please see my comment on the PR.

    @terryjreedy
    Copy link
    Member

    New changeset 4f098b3 by Terry Jan Reedy (Guido van Rossum) in branch 'master':
    bpo-36807: When saving a file in IDLE, call flush and fsync (bpo-13102)
    4f098b3

    @terryjreedy
    Copy link
    Member

    When 3.x is done, I will do 2.7.

    @terryjreedy
    Copy link
    Member

    New changeset 68a11b6 by Terry Jan Reedy (Miss Islington (bot)) in branch '3.7':
    [3.7] bpo-36807: When saving a file in IDLE, call flush and fsync (GH-13102) (bpo-13288)
    68a11b6

    @terryjreedy
    Copy link
    Member

    PR 13102 is original patch for master; it was merged, not closed.
    PR 13284 fixed blurb for master.
    PR 13280 backport for 3.7 was closed trying to restart bogus failure; can't do that for backports.
    PR 13288 is second backport; only optional Azure Pipilines failed.
    PR 13293 is manual 2.7 backport.

    @terryjreedy
    Copy link
    Member

    New changeset 353f8d2 by Terry Jan Reedy in branch '2.7':
    [2.7] bpo-36807: When saving a file in IDLE, call flush and fsync (GH-13102) (GH-13293)
    353f8d2

    @terryjreedy
    Copy link
    Member

    Do you mean from CircuitPython?

    No, I was asking about regular Python on Windows. I was not clear to me whether any of the software that Nina ran (other than possibly Adafruit USB drivers) was from Adafruit. I am now guessing that is was all 3rd party stuff and that IDLE will now be an alternative for part of what she used.

    I was in part wondering whether there were other simple (and allowable) changes to IDLE that might make it more useful for this type of application, but I will leave that sit for now.

    I answered at least some of my questions about USB and the Adafruit hardware and the rest of what you wrote with my search bar.

    @gvanrossum
    Copy link
    Member Author

    Thanks all!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 (EOL) end of life topic-IDLE type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants