Branch: master
Commits on Jul 19, 2018
  1. Allow bigger filenames

    torvalds committed Jul 19, 2018
    Not that I know why. 80-character filename limits are kind of cute.
    Joerg suggested _POSIX_PATH_MAX, which is almost certainly the right
    value.  But that's bigger than HUGE:
       #define HUGE    1000           /* Huge number                  */
    which obviously means we shouldn't go quite *that* extreme.
    Remember, we come from the days when it was hard to do allocations
    larger than 64kB.   We have limits, dammit.
    "256 bytes is enough for anybody"
    Reported-by: Joerg Scheurich <>
    Signed-off-by: Linus Torvalds <>
  2. Use _GNU_SOURCE instead of _BSD_SOURCE and _SYSV_SOURCE

    torvalds committed Jul 19, 2018
    uemacs uses a lot of legacy stuff, which causes warnings with newer
    This makes it build reasonably warning-free.
    Signed-off-by: Linus Torvalds <>
  3. Do some valgrind cleanup

    torvalds committed Jul 19, 2018
    Joerg Scheurich reported that there's a buffer overflow in readin() for
    long path component names. He's not wrong.
    When fixing that, and then checking there's nothing else obviously wrong
    with valgrind, I also noticed it complains about overlapping strcpy().
    So add a hacky version of strscpy(), which (a) handles overlapping, and
    (b) has the proper strscpy() semantics.
    Just say no to strncpy and strlcpy, both of which are terminally broken
    And stop stripping the binary.  The time when the size of the uemacs
    binary was a big deal is long past, and it made valgrind harder.
    Reported-by: Joerg Scheurich <>
    Signed-off-by: Linus Torvalds <>
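    A minimal sketch of a copy helper with strscpy()-style semantics, to
    illustrate the two properties named above.  The name sketch_strscpy and
    the plain -1 error return are assumptions for illustration (the kernel's
    strscpy() reports truncation with -E2BIG); this is not the function from
    the uemacs tree.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the uemacs implementation.
 * Copies src into dst (capacity: size bytes), always NUL-terminates when
 * size > 0, and tolerates overlapping buffers by using memmove().
 * Returns the number of characters copied, or -1 if the string was truncated.
 */
long sketch_strscpy(char *dst, const char *src, size_t size)
{
    size_t len;

    if (size == 0)
        return -1;                  /* no room even for the terminator */

    len = strlen(src);              /* measure before anything is overwritten */
    if (len >= size) {
        memmove(dst, src, size - 1);
        dst[size - 1] = '\0';
        return -1;                  /* truncated */
    }

    memmove(dst, src, len + 1);     /* includes the terminating NUL */
    return (long)len;
}
```

    Unlike strncpy(), the result is always terminated; unlike strlcpy(), the
    return value signals truncation directly instead of returning the source
    length.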
Commits on Jan 17, 2018
  1. Merge branch 'experimental'

    torvalds committed Jan 17, 2018
    I'm not entirely happy with the new paragraph heuristics, but I've been
    using them for a while and have grown used to them and have grown to
    rely on the behavior.  Since there probably aren't all that many other
    people who use uemacs, let's just merge that behavior and see if anybody
    else even notices.
    The "NBSP to SP on keyboard input" is similarly convenient to me, and
    might be offensive to others.  Let's see.
    * experimental:
      Turn NBSP into regular SP on input
      Try updated rule for "is new paragraph"
      Make some minor code legibility changes
Commits on Mar 18, 2017
  1. Turn NBSP into regular SP on input

    torvalds committed Mar 18, 2017
    Particularly pasting from a web browser, I get a lot of 'space' +
    'non-breaking space' noise, and keeping the &nbsp as an actual unicode
    character ends up being a major pain.
    Note: this is only done on input.  If the file contains the unicode
    character U+00A0, we'll keep it that way.  But you can't enter it from
    the keyboard (or cut-and-paste, which ends up looking like keyboard
    Signed-off-by: Linus Torvalds <>
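    The input-only mapping described above can be sketched as a small filter
    applied to each character arriving from the keyboard.  The function name
    is an assumption; file contents would never pass through it.

```c
/*
 * Hypothetical input filter: turn U+00A0 (no-break space) into a regular
 * space.  Every other character is passed through unchanged.
 */
unsigned int input_filter_nbsp(unsigned int c)
{
    return c == 0x00A0 ? 0x0020 : c;
}
```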
Commits on Oct 2, 2016
  1. Try updated rule for "is new paragraph"

    torvalds committed Oct 2, 2016
    This makes non-alphabetic characters at the beginning be a mark of a
    paragraph.  That's probably bogus, but let's try how it works.
    Signed-off-by: Linus Torvalds <>
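    One way to read the rule above, as a sketch only: a line starts a new
    paragraph if its first character is not alphabetic.  The function name
    and the treatment of empty lines are assumptions, and the real heuristic
    in the tree may differ.

```c
#include <ctype.h>

/*
 * Hypothetical check: does this line look like the start of a paragraph?
 * Empty lines are assumed to count as paragraph boundaries.  In the
 * default "C" locale isalpha() is false for bytes >= 0x80, so a line that
 * starts with a multibyte UTF-8 character also counts as non-alphabetic.
 */
int starts_new_paragraph(const char *line, int len)
{
    if (len == 0)
        return 1;
    return !isalpha((unsigned char)line[0]);
}
```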
  2. Make some minor code legibility changes

    torvalds committed Oct 2, 2016
    I'm going to play around with the whole "paragraph ends here" logic, but
    the way it used to be written made that hard.
    Signed-off-by: Linus Torvalds <>
Commits on Dec 8, 2014
  1. Don't use 'char' for number of lines

    torvalds committed Dec 8, 2014
    Heh.  My new UHD monitor makes it easy to have more than 127 lines of
    text.  I guess the 'char' could be an unsigned char, but quite frankly,
    trying to save a couple of bytes per open editor window seems a bit
    excessive these days.  So just make it 'int'.
    Signed-off-by: Linus Torvalds <>
Commits on Feb 22, 2013
  1. Stop using 'short' for line and allocation sizes

    torvalds committed Feb 22, 2013
    Yes, yes, it probably made sense 30 years ago as a way to save a tiny
    amount of memory, but especially when interspersed in structures that
    have pointers (aligned to 64 bits these days), it's not even saving
    memory today.  And it makes us fail in nasty ways when looking at files
    with long lines.
    So just make them 'int'.  And if you have a line that is longer than
    2GB, you only have yourself to blame.  I no longer care.
    In case anybody cares, the "test-case" for this was a lovely UDDF file
    with a binary divecomputer dump encoded as an XML element.  Resulting in
    a lovely 41kB single line.  Not what poor micro-emacs was designed for,
    I'm afraid.
    I really should just learn another editor, rather than continue to
    polish this turd.
    Signed-off-by: Linus Torvalds <>
  2. Avoid memory access errors if llength() overflows

    torvalds committed Feb 22, 2013
    llength() is currently a 'short' which can overflow and result in signed
    numbers if line lengths are larger than 32k.  We'll fix the overflow
    separately, but before we do that, just use a signed int to hold the
    value so that we don't overrun memory allocations when we convert that
    negative number to a large positive unsigned integer.
    Signed-off-by: Linus Torvalds <>
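    The failure mode can be illustrated with two small helpers.  Both names
    are hypothetical.  The first shows why a 41kB line does not fit in a
    typical 16-bit 'short'; the second shows what happens when a negative
    length reaches code that expects an unsigned size.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: can this line length be stored in a 'short'? */
int length_fits_in_short(long len)
{
    return len >= 0 && len <= SHRT_MAX;
}

/*
 * Converting a negative int to size_t wraps modulo SIZE_MAX + 1, so a
 * small negative length turns into an enormous unsigned one.
 */
size_t length_as_size(int len)
{
    return (size_t)len;
}
```

    With a 16-bit 'short', SHRT_MAX is 32767, so a length of 41000 is out of
    range, and length_as_size(-1) is SIZE_MAX.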
Commits on Sep 25, 2012
  1. Fix the unicode character limit (0 .. 0x10ffff)

    torvalds committed Sep 25, 2012
    For some reason I had limited things to 0xffff; it really should be 0x10ffff.
    We don't actually support a full 32-bit unicode model anyway, since we
    use the high bits for the control/meta/^X/special bits, but there was no
    reason to limit things to 16 bits when we had 28 bits available.  And
    the real limit for real Unicode characters is 0x10ffff.
    Add a silly example character past the 16-bit range to the UTF8 demo
    from the 'emoticons' block.
    Signed-off-by: Linus Torvalds <>
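    For reference, a sketch of a UTF-8 encoder that accepts the full
    0 .. 0x10ffff range.  The function name is an assumption and this is not
    the helper from the uemacs source; it also does not reject surrogate
    code points.

```c
/*
 * Hypothetical encoder: writes the UTF-8 form of c into out and returns
 * the number of bytes used (1..4), or 0 if c is above 0x10FFFF.
 */
int sketch_utf8_encode(unsigned int c, unsigned char out[4])
{
    if (c < 0x80) {
        out[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (c >> 12));
        out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    if (c <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (c >> 18));
        out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (c & 0x3F));
        return 4;
    }
    return 0;   /* beyond the Unicode range */
}
```

    A character from the emoticons block such as U+1F600 needs the four-byte
    form (F0 9F 98 80), which a 16-bit limit could never reach.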
Commits on Aug 16, 2012
  1. uemacs: Remove unused 'lflag' variables from file.c

    penberg authored and torvalds committed Aug 15, 2012
    GCC spotted the following unused variable:
        CC       file.o
      file.c: In function ‘readin’:
      file.c:225:6: warning: variable ‘lflag’ set but not used [-Wunused-but-set-variable]
      file.c: In function ‘ifile’:
      file.c:553:6: warning: variable ‘lflag’ set but not used [-Wunused-but-set-variable]
    Signed-off-by: Pekka Enberg <>
    Signed-off-by: Linus Torvalds <>
Commits on Jul 15, 2012
  1. Fix 'getccol()' and 'getgoal()' functions for multibyte UTF-8 characters

    torvalds committed Jul 15, 2012
    These functions convert the byte offset into the column number
    (getccol()) and vice versa (getgoal()).
    Getting this right means that moving up and down the text gets us the
    right columns, rather than moving randomly left and right when you move
    up and down.  We also won't end up in the middle of a utf-8 character,
    because we're not just moving into some random byte offset, we're moving
    into a proper column.
    Signed-off-by: Linus Torvalds <>
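    A simplified sketch of the two conversions, assuming every UTF-8
    character occupies exactly one column.  TABs, control characters and
    wide characters are ignored here, and the function names are
    hypothetical rather than the real getccol()/getgoal().

```c
/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
static int is_continuation(char b)
{
    return ((unsigned char)b & 0xC0) == 0x80;
}

/* Byte offset -> column: count the characters that start before offset. */
int byte_to_column(const char *line, int len, int offset)
{
    int col = 0;

    for (int i = 0; i < offset && i < len; i++)
        if (!is_continuation(line[i]))
            col++;
    return col;
}

/*
 * Column -> byte offset: walk whole characters until the goal column is
 * reached, so the result never lands inside a multibyte sequence.
 */
int column_to_byte(const char *line, int len, int goal)
{
    int col = 0;
    int i = 0;

    while (i < len && col < goal) {
        i++;                                    /* lead byte */
        while (i < len && is_continuation(line[i]))
            i++;                                /* rest of the sequence */
        col++;
    }
    return i;
}
```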
Commits on Jul 11, 2012
  1. Fix vtputc() and simplify show_line by using it again

    torvalds committed Jul 11, 2012
    This re-introduces vtputc() as the way to show characters, which
    reinstates the control character handling, and simplifies show_line() in
    the process.
    vtputc now takes an "int" that is either a unicode character or a signed
    char (so negative values in the range [-1, -128] are considered to be
    the same as [128, 255]).  This allows us to use it regardless of what
    the source of data is.
    Signed-off-by: Linus Torvalds <>
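    The argument convention described above can be sketched as a small
    normalization step.  The helper name is an assumption; it only shows how
    a sign-extended 'char' and a Unicode value can share one 'int' argument.

```c
/*
 * Hypothetical helper: values in [-128, -1] come from a signed 'char'
 * that was sign-extended, so map them back to [128, 255].  Non-negative
 * values are taken as-is.  Inputs below -128 are not expected.
 */
unsigned int normalize_display_char(int c)
{
    if (c < 0)
        c += 256;
    return (unsigned int)c;
}
```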
  2. Start doing character removal properly

    torvalds committed Jul 11, 2012
    This makes actual basic editing work.  Including things like
    justify-paragraph etc, so lines get justified by number of UTF8
    characters rather than bytes.
    There is probably tons of broken stuff left, but this actually seems to
    get the basics working right.
    Signed-off-by: Linus Torvalds <>
  3. Start actually inserting full utf8 sequences

    torvalds committed Jul 11, 2012
    This makes it possible to cut-and-paste the UTF8 testfile into a new
    buffer, and the end result looks correct.
    NOTE! We still do various things wrong while editing.  For example,
    while the cursor movements were fixed, simple things like deleting a
    character still work on single bytes, rather than utf8 characters.
    So while this is getting much closer to actually editing UTF-8 data,
    it's not there yet.
    Signed-off-by: Linus Torvalds <>
  4. Make 'show_line()' do proper TAB handling

    torvalds committed Jul 11, 2012
    The TAB handling got broken by commit cee00b0 ("Show UTF-8 input as
    UTF-8 output") when it stopped doing things one byte at a time.
    I'm sure the other special character cases are broken too.
    Signed-off-by: Linus Torvalds <>
  5. Expand keycode to 'int' from 'short'

    torvalds committed Jul 11, 2012
    This uses the four high bits for the meta and control key sequences.
    This means that we will be limiting our Unicode space to 28 bits, but
    that's more than we really need.
    It *would* be nicer if we just used the sign bit to mark "we have meta
    character information", but that would require bigger changes.  And we
    really don't need to worry about 30-bit unicode.  Small steps, remember.
    Signed-off-by: Linus Torvalds <>
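    A sketch of the layout described above: four flag bits at the top of a
    32-bit keycode and 28 bits of character value below them.  The macro and
    function names are assumptions chosen for illustration, and the code
    assumes a 32-bit 'unsigned int'.

```c
/* Hypothetical flag bits occupying the four high bits of a keycode. */
#define KEY_CONTROL 0x10000000u
#define KEY_META    0x20000000u
#define KEY_CTLX    0x40000000u
#define KEY_SPEC    0x80000000u
#define KEY_FLAGS   0xF0000000u

/* The low 28 bits carry the character value. */
unsigned int key_char(unsigned int key)
{
    return key & ~KEY_FLAGS;
}

/* The high four bits carry the control/meta/^X/special markers. */
unsigned int key_flags(unsigned int key)
{
    return key & KEY_FLAGS;
}
```

    Since real Unicode ends at 0x10ffff (21 bits), 28 bits of character
    space leaves plenty of headroom.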
  6. character input: make sure we have enough bytes for a full utf8 chara…

    torvalds committed Jul 11, 2012
    .. but we do have that 0.1s delay, so if somebody feeds us non-utf8
    sequences, we won't delay forever.
    Signed-off-by: Linus Torvalds <>
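    The "enough bytes" check can be sketched from the lead byte alone.  Both
    function names are hypothetical, and the 0.1s timeout mentioned above is
    not modelled here; it would be the caller's fallback when the check
    keeps failing.

```c
/*
 * Hypothetical helper: how many bytes does a UTF-8 sequence starting
 * with this lead byte need?  Invalid lead bytes (including stray
 * continuation bytes) are treated as single bytes so that non-UTF-8
 * input never makes the caller wait for more data.
 */
int utf8_expected_length(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 1;
}

/* Do the pending input bytes already hold one complete sequence? */
int utf8_input_complete(const unsigned char *buf, int pending)
{
    return pending > 0 && pending >= utf8_expected_length(buf[0]);
}
```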
  7. utf8: make sure to honor the array length properly

    torvalds committed Jul 11, 2012
    Right now the input side can give partial utf8 input, and that showed
    that we didn't properly handle that case.
    Signed-off-by: Linus Torvalds <>
  8. Make kbd macro save area use 'int' instead of short

    torvalds committed Jul 11, 2012
    I'm starting to expand the input value from 'short' (with flags in the
    upper eight bits) to 'int' (with negative values having flags).
    Small baby steps.
    Signed-off-by: Linus Torvalds <>
  9. Use utf8 helper functions for keyboard input

    torvalds committed Jul 11, 2012
    ttgetc() used some homebrew utf8 to unicode translation, limited to just
    the normal latin1 characters.  Use the utf8 helper functions to get it
    right for the more complex cases.
    NOTE! We don't actually handle characters > 0xff right anyway.  And we
    still end up doing Latin1 in the buffers on input.  One small step at a
    Signed-off-by: Linus Torvalds <>
Commits on Jul 10, 2012
  1. Make cursor movement (largely) understand UTF-8 character boundaries

    torvalds committed Jul 10, 2012
    Ok, so it may do odd things if it's not truly utf-8, and when moving up
    and down lines that have utf-8 the cursor moves oddly (because the byte
    offset within the line stays constant, rather than the character
    offset), but with this you can actually open the UTF8 example file and
    move around it, and at least some of the movement makes sense.
    Signed-off-by: Linus Torvalds <>
  2. Split up the utf8 helper functions into a file of their own

    torvalds committed Jul 10, 2012
    Signed-off-by: Linus Torvalds <>
  3. Remove the old utf8_mode thing.

    torvalds committed Jul 10, 2012
    Let's just plan on being fully utf8 some day.  We're not there yet, and
    maybe we'll never be, but having the halfway mode is not useful either.
    Signed-off-by: Linus Torvalds <>
  4. Show UTF-8 input as UTF-8 output

    torvalds committed Jul 10, 2012
    .. by doing the stupid "convert to unicode value and back" model.
    This actually populates the 'struct video' array with the unicode
    values, so UTF8 input actually shows correctly.  In particular, the nice
    test-file (UTF-8-demo.txt) shows up not as garbage, but as the UTF-8 it
    Since the *editing* doesn't know about UTF-8, and considers it just a
    stream of bytes, the end result is not actually a usable utf-8 editor.
    So don't get too excited yet: this is just a partial step to "actually
    edit utf8 data"
    NOTE NOTE NOTE! If the character buffer contains Latin1, we will
    transform that Latin1 to unicode, and then output it as UTF8.  And we
    will edit it correctly as the character-by-character data.  Also, we
    still do the "UTF8 to Latin1" translation on *input*, so with this
    commit we can actually continue to *edit* Latin1 text.
    Signed-off-by: Linus Torvalds <>
  5. Make the 'struct video' contain an array of unicode characters rather…

    torvalds committed Jul 10, 2012
    … than bytes
    This is disgusting.  And quite frankly, it's debatable whether this will
    ever work.  The "line" structure is still just an array of characters,
    so that has to work with utf-8.
    But the 'struct video' thing is what represents the actual screen
    rectangle, and is fixed-size by the size of the screen.  So making it
    contain actual 32-bit unicode characters *may* make sense.
    Right now we translate things the same way we always used to, though, so
    utf-8 in 'struct line' will not be translated to the proper unicode
    array, but to the bytes of the utf-8 representation.  So this really
    doesn't improve anything per se yet, just expands the memory use of the
    video array.
    Signed-off-by: Linus Torvalds <>
  6. Show lines with a single helper function, not one byte at a time

    torvalds committed Jul 10, 2012
    Let's see how hard it is to show UTF-8 characters properly.
    Signed-off-by: Linus Torvalds <>
Commits on May 26, 2012
  1. Make uemacs build on FreeBSD.

    naota authored and torvalds committed May 26, 2012
    See <>.
    Signed-off-by: Ulrich Müller <>
    Signed-off-by: Linus Torvalds <>
Commits on Aug 25, 2011
  1. spawn.c: do the "keyboard open/close" around shell invocations

    torvalds committed Aug 25, 2011
    I'm not 100% sure we really should even be doing this whole "keyboard"
    open/close for termcap, but even if the right thing to do ends up being
    to just do everything in the TTopen/TTclose (and make TTkopen/TTkclose
    no-ops), it does seem to be the right thing to do.
    Reported-by: Bijan Soleymani <>
    Signed-off-by: Linus Torvalds <>
  2. file.c: remove crazy keyboard open/close calls

    torvalds committed Aug 25, 2011
    It seems to have something to do with some old DOS mode, and not having
    keyboard translation on ("Insert floppy A:" questions while opening
    files? Whatever).  But this is while doing normal file opens, and it is
    just insane to open/close a tty across a file open.
    The possible tty init/exit sequence would mess up some of the file
    read/write messages.
    Reported-by: Bijan Soleymani <>
    Signed-off-by: Linus Torvalds <>
  3. Force a screen re-draw after tcap 'ti' on terminal open

    torvalds committed Aug 25, 2011
    The 'tcapkopen()' function re-initializes the terminal with the 'ti'
    sequence, which for most sane termcap entries is just empty.  But for
    'xterm', that seems to actually be a real control sequence (clear and
    reset?), and we'd better tell display.c that the screen is now garbage
    and needs to be re-drawn.
    Also, make tcapkclose() match the 'ti' (terminal init) with a 'te'
    (terminal exit).
    Maybe we should just stop playing games with ti/te, but this at least
    improves the situation a bit.
    Reported-by: Bijan Soleymani <>
    Signed-off-by: Linus Torvalds <>
Commits on Aug 22, 2011
  1. uemacs: Add -g options to the output usage.

    tfarina authored and torvalds committed Apr 23, 2011
    While I'm here, improve the wording of the above two options.
    Signed-off-by: Thiago Farina <>
    Signed-off-by: Linus Torvalds <>
  2. Respect LDFLAGS when linking.

    ulm authored and torvalds committed Aug 21, 2011
    Signed-off-by: Ulrich Müller <>
    Signed-off-by: Linus Torvalds <>
  3. Show xA0 (nbsp) as a non-printable character

    torvalds committed Aug 22, 2011
    I want to see the difference between space and nbsp, and I consider nbsp
    to be a control character, so show it as such.  Even if it is
    technically "printable".
    Signed-off-by: Linus Torvalds <>
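    As a sketch, the display decision amounts to adding 0xA0 to the set of
    characters that are not drawn literally.  The function name and the
    exact set of other control characters are assumptions; how the character
    is then rendered is left out.

```c
/*
 * Hypothetical predicate: should this character be shown in an escaped,
 * visible form instead of being drawn as-is?  NBSP (0xA0) is included
 * even though it is technically printable.
 */
int shown_as_control(unsigned int c)
{
    return c < 0x20 || c == 0x7F || c == 0xA0;
}
```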