Commits on Nov 16, 2007
  1. @gregkh

    Linux 2.6.23.4

    gregkh authored
  2. @linvjw @gregkh

    mac80211: make ieee802_11_parse_elems return void

    linvjw authored gregkh committed
    patch 67a4cce in mainline.
    
    Some APs send management frames with junk padding after the last IE.
    We already account for a similar problem with some Apple Airport
    devices, but at least one device is known to send more than a single
    extra byte.  The device in question is the Draytek Vigor2900:
    
    	http://www.draytek.com.au/products/Vigor2900.php
    
    The junk in question looks like an IE that runs off the end of the
    frame.  This causes us to return ParseFailed.  Since the frame in
    question is an association response, this causes us to fail to associate
    with this AP.
    
    The return code from ieee802_11_parse_elems is superfluous.
    All callers still check for the presence of the specific IEs that
    interest them anyway.  So, remove the return code so the parse never
    "fails".
    
    Acked-by: Michael Wu <flamingice@sourmilk.net>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
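
The tolerant parse described in this commit can be sketched in userspace C. This is only an illustration, not the mac80211 code: the structure and function names are hypothetical, and only the SSID element (element ID 0) is tracked. The walk stops quietly at an element that runs off the end of the frame, and the caller checks for the fields it needs.

```c
#include <stddef.h>
#include <stdint.h>

#define EID_SSID 0	/* element ID 0 carries the SSID in 802.11 */

/* Hypothetical result structure: only the SSID element is tracked. */
struct parsed_elems {
	const uint8_t *ssid;	/* NULL if no SSID element was found */
	uint8_t ssid_len;
};

/*
 * Walk the id/length/data elements in buf.  There is no return code:
 * a truncated trailing element simply ends the walk, and callers
 * check the presence of the fields that interest them.
 */
static void parse_elems(const uint8_t *buf, size_t len,
			struct parsed_elems *out)
{
	size_t pos = 0;

	out->ssid = NULL;
	out->ssid_len = 0;

	while (len - pos >= 2) {
		uint8_t id = buf[pos];
		uint8_t elen = buf[pos + 1];

		pos += 2;
		if (elen > len - pos)
			return;	/* junk padding: element runs off the frame */

		if (id == EID_SSID) {
			out->ssid = buf + pos;
			out->ssid_len = elen;
		}
		pos += elen;
	}
}
```

A frame whose valid SSID element is followed by a bogus element claiming 40 bytes of data still yields the SSID; the junk is ignored rather than failing the whole parse.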
  3. @linvjw @gregkh

    mac80211: only honor IW_SCAN_THIS_ESSID in STA, IBSS, and AP modes

    linvjw authored gregkh committed
    patch d114f39 in mainline.
    
    The previous IW_SCAN_THIS_ESSID patch left a hole allowing scan
    requests on interfaces in inappropriate modes.
    
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  4. @gregkh

    mac80211: honor IW_SCAN_THIS_ESSID in siwscan ioctl

    Bill Moss authored gregkh committed
    patch 107acb2 in mainline.
    
    This patch fixes the problem of associating with wpa_secured hidden
    AP.  Please try out.
    
    The original author of this patch is Bill Moss <bmoss@clemson.edu>
    
    Signed-off-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  5. @linvjw @gregkh

    mac80211: store SSID in sta_bss_list

    linvjw authored gregkh committed
    patch cffdd30 in mainline.
    
    Some AP equipment "in the wild" services multiple SSIDs using the
    same BSSID.  This patch changes the key of sta_bss_list to include
    the SSID as well as the BSSID and the channel so as to prevent one
    SSID from eclipsing another SSID with the same BSSID.
    
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
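
As a rough illustration of the wider key (BSSID, channel, and SSID together), a lookup comparison might look like the sketch below. The structure layout and names are assumptions made for the example and do not reproduce the actual sta_bss_list entry.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_SSID_LEN 32

/* Hypothetical key: the three fields that identify one BSS entry. */
struct bss_key {
	uint8_t bssid[6];
	int channel;
	uint8_t ssid[MODEL_MAX_SSID_LEN];
	uint8_t ssid_len;
};

/* Two entries match only if BSSID, channel, and SSID all agree. */
static int bss_key_equal(const struct bss_key *a, const struct bss_key *b)
{
	return memcmp(a->bssid, b->bssid, sizeof(a->bssid)) == 0 &&
	       a->channel == b->channel &&
	       a->ssid_len == b->ssid_len &&
	       memcmp(a->ssid, b->ssid, a->ssid_len) == 0;
}
```

With this comparison, two SSIDs served from the same BSSID on the same channel occupy separate entries instead of one overwriting the other.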
  6. @linvjw @gregkh

    mac80211: store channel info in sta_bss_list

    linvjw authored gregkh committed
    patch 65c107a in mainline.
    
    Some AP equipment "in the wild" uses the same BSSID on multiple channels
    (particularly "a" vs. "b/g").  This patch changes the key of sta_bss_list
    to include both the BSSID and the channel so as to prevent a BSSID on
    one channel from eclipsing the same BSSID on another channel.
    
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  7. @jmberg @gregkh

    mac80211: reorder association debug output

    jmberg authored gregkh committed
    patch 1dd84aa in mainline.
    
    There's no reason to warn about an invalid AID field when the
    association was denied.
    
    Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
    Acked-by: Michael Wu <flamingice@sourmilk.net>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  8. @jmberg @gregkh

    ieee80211: fix TKIP QoS bug

    jmberg authored gregkh committed
    patch e797aa1 in mainline.
    
    The commit 65b6a27 titled "ieee80211: Fix header->qos_ctl endian issue"
    *introduced* an endianness bug. Partially revert it.
    
    Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  9. @gregkh

    NETFILTER: nf_conntrack_tcp: fix connection reopening

    Jozsef Kadlecsik authored gregkh committed
    Upstream commits: 1731139 + bc34b84 merged together.  Merge done by
    Patrick McHardy <kaber@trash.net>
    
    [NETFILTER]: nf_conntrack_tcp: fix connection reopening
    
    With your description I could reproduce the bug and actually you were
    completely right: the code above is incorrect. Somehow I was able to
    misread RFC1122 and mixed the roles :-(:
    
       When a connection is >>closed actively<<, it MUST linger in
       TIME-WAIT state for a time 2xMSL (Maximum Segment Lifetime).
       However, it MAY >>accept<< a new SYN from the remote TCP to
       reopen the connection directly from TIME-WAIT state, if it:
       [...]
    
    The fix is as follows: if the receiver initiated an active close, then the
    sender may reopen the connection - otherwise try to figure out if we hold
    a dead connection.
    
    Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
    Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
    Signed-off-by: Patrick McHardy <kaber@trash.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  10. @kaber @gregkh

    Fix netlink timeouts.

    kaber authored gregkh committed
    [NETLINK]: Fix unicast timeouts
    
    [ Upstream commit: c3d8d1e ]
    
    Commit ed6dcf4a in the history.git tree broke netlink_unicast timeouts
    by moving the schedule_timeout() call to a new function that doesn't
    propagate the remaining timeout back to the caller. This means on each
    retry we start with the full timeout again.
    
    ipc/mqueue.c seems to actually want to wait indefinitely so this
    behaviour is retained.
    
    Signed-off-by: Patrick McHardy <kaber@trash.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
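
The effect of dropping the remaining timeout can be shown with a small model. This is not the netlink code: wait_model() is a hypothetical stand-in for a helper that sleeps for one slice and reports how much of the timeout is left, and the retry cap exists only so the broken variant terminates.

```c
/* Pretend to sleep for up to `slice` ticks; return the time left. */
static long wait_model(long timeout, long slice)
{
	return timeout > slice ? timeout - slice : 0;
}

/* Fixed behaviour: the remaining timeout is carried into each retry. */
static int retries_with_propagation(long timeout, long slice, int max_retries)
{
	int n = 0;

	while (timeout > 0 && n < max_retries) {
		timeout = wait_model(timeout, slice);
		n++;
	}
	return n;
}

/* Broken behaviour: every retry starts from the full timeout again. */
static int retries_without_propagation(long timeout, long slice, int max_retries)
{
	int n = 0;

	while (timeout > 0 && n < max_retries) {
		(void)wait_model(timeout, slice);	/* result is lost */
		n++;
	}
	return n;
}
```

With a timeout of 100 ticks and 30-tick slices, the first loop gives up after four waits, while the second only stops because of the artificial cap.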
  11. @herbertx @gregkh

    Fix crypto_alloc_comp() error checking.

    herbertx authored gregkh committed
    [IPSEC]: Fix crypto_alloc_comp error checking
    
    [ Upstream commit: 4999f36 ]
    
    The function crypto_alloc_comp returns an errno instead of NULL
    to indicate error.  So it needs to be tested with IS_ERR.
    
    This is based on a patch by Vicenç Beltran Querol.
    
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
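
For readers unfamiliar with the error-pointer convention, here is a userspace model of it. The lowercase helpers mimic the kernel's ERR_PTR(), PTR_ERR(), and IS_ERR(); alloc_comp_model() and init_state() are hypothetical names used only for this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)ptr;
}

/* Error pointers live in the top MAX_ERRNO addresses. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator that reports failure as an error pointer. */
static void *alloc_comp_model(int fail)
{
	static int dummy_tfm;

	return fail ? err_ptr(-ENOMEM) : (void *)&dummy_tfm;
}

static int init_state(int fail)
{
	void *tfm = alloc_comp_model(fail);

	if (is_err(tfm))	/* a "!tfm" test would miss this failure */
		return (int)ptr_err(tfm);
	return 0;
}
```

The failing allocation returns a non-NULL pointer, which is why a NULL check lets the error slip through.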
  12. @kaber @gregkh

    Fix SET_VLAN_INGRESS_PRIORITY_CMD error return.

    kaber authored gregkh committed
    patch fffe470 in mainline.
    
    [VLAN]: Fix SET_VLAN_INGRESS_PRIORITY_CMD ioctl
    
    Based on report and patch by Doug Kehn <rdkehn@yahoo.com>:
    
    vconfig returns the following error when attempting to execute the
    set_ingress_map command:
    
    vconfig: socket or ioctl error for set_ingress_map: Operation not permitted
    
    In vlan.c, vlan_ioctl_handler for SET_VLAN_INGRESS_PRIORITY_CMD
    sets err = -EPERM and calls vlan_dev_set_ingress_priority.
    vlan_dev_set_ingress_priority is a void function so err remains
    at -EPERM and results in the vconfig error (even though the ingress
    map was set).
    
    Fix by setting err = 0 after the vlan_dev_set_ingress_priority call.
    
    Signed-off-by: Patrick McHardy <kaber@trash.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  13. @kaber @gregkh

    Fix VLAN address syncing.

    kaber authored gregkh committed
    patch d932e04 in mainline.
    
    [PATCH] [VLAN]: Don't synchronize addresses while the vlan device is down
    
    While the VLAN device is down, the unicast addresses are not configured
    on the underlying device, so we shouldn't attempt to sync them.
    
    Noticed by Dmitry Butskoy <buc@odusz.so-cdu.ru>
    
    Signed-off-by: Patrick McHardy <kaber@trash.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  14. @gregkh

    Fix endianness bug in U32 classifier.

    Radu Rendec authored gregkh committed
    changeset 543821c in mainline.
    
    [PKT_SCHED] CLS_U32: Fix endianness problem with u32 classifier hash masks.
    
    While trying to implement u32 hashes in my shaping machine I ran into
    a possible bug in the u32 hash/bucket computing algorithm
    (net/sched/cls_u32.c).
    
    The problem occurs only with hash masks that extend over the octet
    boundary, on little endian machines (where htonl() actually does
    something).
    
    Let's say that I would like to use 0x3fc0 as the hash mask. This means
    8 contiguous "1" bits starting at b6. With such a mask, the expected
    (and logical) behavior is to hash any address in, for instance,
    192.168.0.0/26 in bucket 0, then any address in 192.168.0.64/26 in
    bucket 1, then 192.168.0.128/26 in bucket 2 and so on.
    
    This is exactly what would happen on a big endian machine, but on
    little endian machines, what would actually happen with current
    implementation is 0x3fc0 being reversed (into 0xc03f0000) by htonl()
    in the userspace tool and then applied to 192.168.x.x in the u32
    classifier. When shifting right by 16 bits (rank of first "1" bit in
    the reversed mask) and applying the divisor mask (0xff for divisor
    256), what would actually remain is 0x3f applied on the "168" octet of
    the address.
    
    One could say this can be easily worked around by taking endianness
    into account in userspace and supplying an appropriate mask (0xfc03)
    that would be turned into contiguous "1" bits when reversed
    (0x03fc0000). But the actual problem is the network address (inside
    the packet) not being converted to host order, but used as a
    host-order value when computing the bucket.
    
    Let's say the network address is written as n31 n30 ... n0, with n0
    being the least significant bit. When used directly (without any
    conversion) on a little endian machine, it becomes n7 ... n0 n8 ..n15
    etc in the machine's registers. Thus bits n7 and n8 would no longer be
    adjacent and 192.168.64.0/26 and 192.168.128.0/26 would no longer be
    consecutive.
    
    The fix is to apply ntohl() on the hmask before computing fshift,
    and in u32_hash_fold() convert the packet data to host order before
    shifting down by fshift.
    
    With helpful feedback from Jamal Hadi Salim and Jarek Poplawski.
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
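
The corrected bucket computation can be approximated in userspace as follows. This is a sketch of the idea (convert to host order before shifting), not the cls_u32.c code; the function names are made up, and the mask and key are assumed to arrive in network byte order, as they would from the userspace tool and the packet.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Bit position of the lowest set bit (0 if the value is zero). */
static unsigned int lowest_set_bit(uint32_t v)
{
	unsigned int shift = 0;

	while (v && !(v & 1)) {
		v >>= 1;
		shift++;
	}
	return shift;
}

/*
 * key_be and hmask_be are in network byte order.  Both the shift
 * amount and the masked key are derived in host order, so contiguous
 * mask bits stay contiguous on little-endian machines too.
 */
static unsigned int hash_bucket(uint32_t key_be, uint32_t hmask_be,
				unsigned int divisor_mask)
{
	unsigned int fshift = lowest_set_bit(ntohl(hmask_be));

	return (ntohl(key_be & hmask_be) >> fshift) & divisor_mask;
}
```

With the 0x3fc0 mask from the description and a divisor mask of 0xff, 192.168.0.0, 192.168.0.64, and 192.168.0.128 land in buckets 0, 1, and 2 regardless of host endianness.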
  15. @gregkh

    Fix TEQL oops.

    Evgeniy Polyakov authored gregkh committed
    [PKT_SCHED]: Fix OOPS when removing devices from a teql queuing discipline
    
    [ Upstream commit: 4f9f831 ]
    
    teql_reset() is called from deactivate and qdisc is set to noop already,
    but subsequent teql_xmit does not know about it and dereferences private
    data as teql qdisc and thus oopses.
    not catch it first :)
    
    Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  16. @davem330 @gregkh

    Fix error returns in sys_socketpair()

    davem330 authored gregkh committed
    patch bf3c23d in mainline.
    
    [NET]: Fix error reporting in sys_socketpair().
    
    If either of the two sock_alloc_fd() calls fail, we
    forget to update 'err' and thus we'll erroneously
    return zero in these cases.
    
    Based upon a report and patch from Rich Paul, and
    commentary from Chuck Ebbert.
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  17. @jmberg @gregkh

    softmac: fix wext MLME request reason code endianness

    jmberg authored gregkh committed
    patch 94e10bf in mainline.
    
    The MLME request reason code is host-endian and our passing
    it to the low level functions is host-endian as well since
    they do the swapping. I noticed that the reason code 768 was
    sent (0x300) rather than 3 when wpa_supplicant terminates.
    This removes the superfluous cpu_to_le16() call.
    
    Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
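
The 768-versus-3 symptom is what a double conversion produces when the extra cpu_to_le16() actually swaps bytes, which is the case on a big-endian host (an assumption here; the commit does not name the machine). The model below is endian-independent and uses hypothetical helper names: the low-level path always writes little-endian, and pre_swap stands in for the superfluous swap.

```c
#include <stdint.h>

static uint16_t swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Low-level send path: writes the host value in little-endian order. */
static void put_le16(uint8_t wire[2], uint16_t host_val)
{
	wire[0] = (uint8_t)(host_val & 0xff);
	wire[1] = (uint8_t)(host_val >> 8);
}

/* Receiver: decodes the little-endian field from the frame. */
static uint16_t get_le16(const uint8_t wire[2])
{
	return (uint16_t)(wire[0] | (wire[1] << 8));
}

/* Reason code as the peer sees it; pre_swap models the extra swap. */
static uint16_t reason_on_air(uint16_t reason, int pre_swap)
{
	uint8_t wire[2];

	put_le16(wire, pre_swap ? swab16(reason) : reason);
	return get_le16(wire);
}
```

Passing the host-endian value straight through yields 3 on the air; swapping it first yields 0x300, which is 768.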
  18. @abattersby @gregkh

    Fix kernel_accept() return handling.

    abattersby authored gregkh committed
    patch fa8705b in mainline.
    
    [NET]: sanitize kernel_accept() error path
    
    If kernel_accept() returns an error, it may pass back a pointer to
    freed memory (which the caller should ignore).  Make it pass back NULL
    instead for better safety.
    
    Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  19. @herbertx @gregkh

    TCP: Fix size calculation in sk_stream_alloc_pskb

    herbertx authored gregkh committed
    [TCP]: Fix size calculation in sk_stream_alloc_pskb
    
    [ Upstream commit: fb93134 ]
    
    We round up the header size in sk_stream_alloc_pskb so that
    TSO packets get zero tail room.  Unfortunately this rounding
    up is not coordinated with the select_size() function used by
    TCP to calculate the second parameter of sk_stream_alloc_pskb.
    
    As a result, we may allocate more than a page of data in the
    non-TSO case when exactly one page is desired.
    
    In fact, rounding up the head room is detrimental in the non-TSO
    case because it makes memory that would otherwise be available to
    the payload head room.  TSO doesn't need this either, all it wants
    is the guarantee that there is no tail room.
    
    So this patch fixes this by adjusting the skb_reserve call so that
    exactly the requested amount (which all callers have calculated in
    a precise way) is made available as tail room.
    
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  20. @herbertx @gregkh

    Fix SKB_WITH_OVERHEAD calculations.

    herbertx authored gregkh committed
    patch deea84b in mainline.
    
    [NET]: Fix SKB_WITH_OVERHEAD calculation
    
    The calculation in SKB_WITH_OVERHEAD is incorrect in that it can cause
    an overflow across a page boundary which is what it's meant to prevent.
    In particular, the header length (X) should not be lumped together with
    skb_shared_info.  The latter needs to be aligned properly while the header
    has no choice but to sit in front of wherever the payload is.
    
    Therefore the correct calculation is to take away the aligned size of
    skb_shared_info, and then subtract the header length.  The resulting
    quantity L satisfies the following inequality:
    
    	SKB_DATA_ALIGN(L + X) + sizeof(struct skb_shared_info) <= PAGE_SIZE
    
    This is the quantity used by alloc_skb to do the actual allocation.
    
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
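
The inequality above can be checked numerically with a small model. All constants are assumptions chosen for illustration (4096-byte page, 64-byte alignment, and a made-up 200-byte skb_shared_info), and max_payload_old() is one reading of "header lumped together with skb_shared_info", not a quotation of the old macro.

```c
/* Assumed values for the model; real kernels differ. */
#define MODEL_PAGE_SIZE   4096u
#define MODEL_CACHE_BYTES 64u
#define MODEL_SHINFO_SIZE 200u	/* hypothetical sizeof(struct skb_shared_info) */

#define MODEL_ALIGN(n) \
	(((n) + MODEL_CACHE_BYTES - 1) & ~(MODEL_CACHE_BYTES - 1))

/* Old style: subtract header and shared info together, then round down. */
static unsigned int max_payload_old(unsigned int hdr)
{
	return (MODEL_PAGE_SIZE - hdr - MODEL_SHINFO_SIZE) &
	       ~(MODEL_CACHE_BYTES - 1);
}

/* Fixed style: take away the aligned shared info, then the header. */
static unsigned int max_payload_new(unsigned int hdr)
{
	return MODEL_PAGE_SIZE - MODEL_ALIGN(MODEL_SHINFO_SIZE) - hdr;
}

/* The allocation fits if aligned data plus shared info stay in the page. */
static int fits_in_page(unsigned int payload, unsigned int hdr)
{
	return MODEL_ALIGN(payload + hdr) + MODEL_SHINFO_SIZE <=
	       MODEL_PAGE_SIZE;
}
```

With a 16-byte header, the old-style result needs 4104 bytes once the data area is aligned, while the fixed result needs 4040 and stays within the page.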
  21. @gregkh

    Fix 9P protocol build

    Ingo Molnar authored gregkh committed
    patch 092e9d9 in mainline.
    
    [9P]: build fix with !CONFIG_SYSCTL
    
    found via make randconfig build testing:
    
     net/built-in.o: In function `init_p9':
     mod.c:(.init.text+0x3b39): undefined reference to `p9_sysctl_register'
     net/built-in.o: In function `exit_p9':
     mod.c:(.exit.text+0x36b): undefined reference to `p9_sysctl_unregister'
    
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  22. @kaber @gregkh

    Fix advertised packet scheduler timer resolution

    kaber authored gregkh committed
    patch 3c0cfc1 in mainline
    
    The fourth parameter of /proc/net/psched is supposed to show the timer
    resolution and is used by HTB userspace to calculate the necessary
    burst rate. Currently we show the clock resolution, which results in a
    too low burst rate when the two differ.
    
    Signed-off-by: Patrick McHardy <kaber@trash.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  23. @warmcat @gregkh

    Add get_unaligned to ieee80211_get_radiotap_len

    warmcat authored gregkh committed
    patch dfe6e81 in mainline.
    
    ieee80211_get_radiotap_len() dereferences the radiotap length field
    directly, but the header may be completely unaligned, so get_unaligned()
    is required.
    
    Signed-off-by: Andy Green <andy@warmcat.com>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
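
In portable userspace C, the same concern is usually handled by assembling the value from individual bytes instead of casting the buffer to a wider type. The sketch below assumes the standard radiotap header layout (one version byte, one pad byte, then a little-endian 16-bit length); radiotap_len_model() is a hypothetical name, not the kernel function.

```c
#include <stdint.h>

/*
 * Read the 16-bit little-endian length at offset 2 of a radiotap
 * header.  Byte-wise access is safe at any alignment.
 */
static uint16_t radiotap_len_model(const uint8_t *data)
{
	return (uint16_t)(data[2] | (data[3] << 8));
}
```

This works even when the header starts at an odd address, where a direct 16-bit load could fault or be slow on some architectures.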
  24. @warmcat @gregkh

    mac80211: Improve sanity checks on injected packets

    warmcat authored gregkh committed
    patch 9b8a74e in mainline.
    
    Michael Wu noticed that the skb length is not checked thoroughly enough when
    a packet is presented on the Monitor interface for injection.
    
    This patch improves the sanity checking and removes fake offsets placed
    into the skb network and transport header.
    
    Signed-off-by: Andy Green <andy@warmcat.com>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  25. @linvjw @gregkh

    mac80211: filter locally-originated multicast frames

    linvjw authored gregkh committed
    patch b331615 in mainline.
    
    In STA mode, the AP will echo our traffic.  This includes multicast
    traffic.
    
    Receiving these frames confuses some protocols and applications,
    notably IPv6 Duplicate Address Detection.
    
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
    Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
    Acked-by: Michael Wu <flamingice@sourmilk.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  26. @gregkh

    Linux 2.6.23.3

    gregkh authored
  27. @torvalds @gregkh

    revert "x86_64: allocate sparsemem memmap above 4G"

    torvalds authored gregkh committed
    Reverted upstream by commit 6a22c57
    
    Revert this commit:
    
    	commit 2e1c49d
    	Author: Zou Nan hai <nanhai.zou@intel.com>
    	Date:   Fri Jun 1 00:46:28 2007 -0700
    	
    	x86_64: allocate sparsemem memmap above 4G
    
    This reverts commit 2e1c49d.
    
    First off, testing in Fedora has shown it to cause boot failures,
    bisected down by Martin Ebourne, and reported by Dave Jones.  So the
    commit will likely be reverted in the 2.6.23 stable kernels.
    
    Secondly, in the 2.6.24 model, x86-64 has now grown support for
    SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while the
    bug is not visible any more, it's become invisible due to the code just
    being irrelevant and no longer enabled on the only architecture that
    this ever affected.
    
    Reported-by: Dave Jones <davej@redhat.com>
    Tested-by: Martin Ebourne <fedora@ebourne.me.uk>
    Cc: Zou Nan hai <nanhai.zou@intel.com>
    Cc: Suresh Siddha <suresh.b.siddha@intel.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Acked-by: Andy Whitcroft <apw@shadowen.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Chuck Ebbert <cebbert@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  28. @gregkh

    x86: fix TSC clock source calibration error

    Dave Johnson authored gregkh committed
    patch edaf420 in mainline.
    
    I ran into this problem on a system that was unable to obtain NTP sync
    because the clock was running very slow (over 10000ppm slow). ntpd had
    declared all of its peers 'reject' with 'peer_dist' reason.
    
    On investigation, the tsc_khz variable was significantly incorrect
    causing xtime to run slow.  After a reboot tsc_khz was correct so I
    did a reboot test to see how often the problem occurred:
    
    Test was done on a 2000 MHz Xeon system.  Of 689 reboots, 8 of them
    had unacceptable tsc_khz values (>500ppm):
    
     range of tsc_khz  # of boots  % of boots
     ----------------  ----------  ----------
            < 1999750           0      0.000%
    1999750 - 1999800          21      3.048%
    1999800 - 1999850         166     24.128%
    1999850 - 1999900         241     35.029%
    1999900 - 1999950         211     30.669%
    1999950 - 2000000          42      6.105%
    2000000 - 2000050           0      0.000%
    2000050 - 2000100           0      0.000%
                       [...]
    2000100 - 2015000           1      0.145%  << BAD
    2015000 - 2030000           6      0.872%  << BAD
    2030000 - 2045000           1      0.145%  << BAD
    2045000 <                   0      0.000%
    
    The worst boot was 2032.577 MHz, over 1.5% off!
    
    It appears that on rare occasions, mach_countup() is taking longer to
    complete than necessary.
    
    I suspect that this is caused by the CPU taking a periodic SMI
    interrupt right at the end of the 30ms calibration loop.  This would
    cause the loop to delay while the SMI BIOS handler runs. The resulting
    TSC value is beyond what it actually should be resulting in a higher
    tsc_khz.
    
    The below patch makes native_calculate_cpu_khz() take the best
    (shortest duration, lowest khz) run of its 3 calibration loops.  If a
    SMI goes off causing a bad result (long duration, higher khz) it will
    be discarded.
    
    With the patch applied, 300 boots of the same system produce good
    results:
    
     range of tsc_khz  # of boots  % of boots
     ----------------  ----------  ----------
            < 1999750           0      0.000%
    1999750 - 1999800          30     10.000%
    1999800 - 1999850         166     55.333%
    1999850 - 1999900          89     29.667%
    1999900 - 1999950          15      5.000%
    1999950 <                   0      0.000%
    
    Problem was found and tested against 2.6.18.  Patch is against 2.6.22.
    
    Signed-off-by: Dave Johnson <djohnson@sw.starentnetworks.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
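
The selection rule (keep the lowest khz of several calibration runs) is simple enough to sketch. The function name is hypothetical, and treating a zero entry as a failed run is an assumption of this example rather than something the commit states.

```c
#include <stddef.h>

/*
 * Return the lowest non-zero khz value among the calibration runs.
 * An SMI that stretches a run inflates its khz, so the smallest
 * value is taken as the undisturbed measurement.  Returns 0 if
 * every run failed (assumed convention: 0 means "no result").
 */
static unsigned long pick_best_khz(const unsigned long *runs, size_t n)
{
	unsigned long best = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (runs[i] != 0 && (best == 0 || runs[i] < best))
			best = runs[i];
	return best;
}
```

Given runs of 1999870, 2032577, and 1999850 khz, the inflated middle value is discarded and 1999850 is used.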
  29. @gregkh

    x86 setup: sizeof() is unsigned, unbreak comparisons

    H. Peter Anvin authored gregkh committed
    patch e6e1ace in mainline.
    
    
    We use signed values for limit checking since the values can go
    negative under certain circumstances.  However, sizeof() is unsigned
    and forces the comparison to be unsigned, so move the comparison into
    the heap_free() macros so we can ensure it is a signed comparison.
    
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
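
The pitfall is a general C one and can be reproduced outside the boot code. In the sketch below, both function names are hypothetical; the first spells out the conversion the compiler applies implicitly when an int is compared against a sizeof() result, and the second forces a signed comparison.

```c
#include <stddef.h>

/*
 * What happens when a signed int meets sizeof(): the int is converted
 * to size_t first, so a negative value becomes a huge positive one.
 */
static int has_room_unsigned(int avail, size_t need)
{
	return (size_t)avail >= need;
}

/* Forcing the comparison to be signed gives the intended answer. */
static int has_room_signed(int avail, size_t need)
{
	return avail >= (int)need;
}
```

With avail at -4, the unsigned comparison claims there is room for any object, while the signed one correctly reports that there is none.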
  30. @gregkh

    x86 setup: handle boot loaders which set up the stack incorrectly

    H. Peter Anvin authored gregkh committed
    patch 6b6815c in mainline.
    
    Apparently some specific versions of LILO enter the kernel with a
    stack pointer that doesn't match the rest of the segments.  Make our
    best attempt at untangling the resulting mess.
    
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  31. @gregkh

    x86: fix global_flush_tlb() bug

    Ingo Molnar authored gregkh committed
    patch 9a24d04 upstream
    
    While we were reviewing pageattr_32/64.c for unification,
    Thomas Gleixner noticed the following serious SMP bug in
    global_flush_tlb():
    
    	down_read(&init_mm.mmap_sem);
    	list_replace_init(&deferred_pages, &l);
    	up_read(&init_mm.mmap_sem);
    
    this is SMP-unsafe because list_replace_init() done on two CPUs in
    parallel can corrupt the list.
    
    This bug was introduced about a year ago in the 64-bit tree:
    
           commit ea7322d
           Author: Andi Kleen <ak@suse.de>
           Date:   Thu Dec 7 02:14:05 2006 +0100
    
           [PATCH] x86-64: Speed and clean up cache flushing in change_page_attr
    
                    down_read(&init_mm.mmap_sem);
            -       dpage = xchg(&deferred_pages, NULL);
            +       list_replace_init(&deferred_pages, &l);
                    up_read(&init_mm.mmap_sem);
    
    the xchg() based version was SMP-safe, but list_replace_init() is not.
    So this "cleanup" introduced a nasty bug.
    
    why this bug never became prominent is a mystery - it can probably be
    explained with the (still) relative obscurity of the x86_64 architecture.
    
    the safe fix for now is to write-lock init_mm.mmap_sem.
    
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Andi Kleen <ak@suse.de>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  32. @jsgf @gregkh

    xfs: eagerly remove vmap mappings to avoid upsetting Xen

    jsgf authored gregkh committed
    patch ace2e92 in mainline.
    
    XFS leaves stray mappings around when it vmaps memory to make it
    virtually contiguous.  This upsets Xen if one of those pages is being
    recycled into a pagetable, since it finds an extra writable mapping of
    the page.
    
    This patch solves the problem in a brute force way, by making XFS
    always eagerly unmap its mappings.
    
    [ Stable: This works around a bug in 2.6.23.  We may come up with a
    better solution for mainline, but this seems like a low-impact fix for
    the stable kernel. ]
    
    Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
    Cc: XFS masters <xfs-masters@oss.sgi.com>
    Cc: Morten Bøgeskov <xen-users@morten.bogeskov.dk>
    Cc: Mark Williamson <mark.williamson@cl.cam.ac.uk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  33. @jsgf @gregkh

    xen: fix incorrect vcpu_register_vcpu_info hypercall argument

    jsgf authored gregkh committed
    patch e3d2697 in mainline.
    
    The kernel's copy of struct vcpu_register_vcpu_info was out of date,
    at best causing the hypercall to fail and the guest kernel to fall
    back to the old mechanism, or worse, causing random memory corruption.
    
    Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
    Cc: Stable Kernel <stable@kernel.org>
    Cc: Morten Bøgeskov <xen-users@morten.bogeskov.dk>
    Cc: Mark Williamson <mark.williamson@cl.cam.ac.uk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  34. @jsgf @gregkh

    xen: deal with stale cr3 values when unpinning pagetables

    jsgf authored gregkh committed
    patch 9f79991 in mainline.
    
    When a pagetable is no longer in use, it must be unpinned so that its
    pages can be freed.  However, this is only possible if there are no
    stray uses of the pagetable.  The code currently deals with all the
    usual cases, but there's a rare case where a vcpu is changing cr3, but
    is doing so lazily, and the change hasn't actually happened by the time
    the pagetable is unpinned, even though it appears to have been completed.
    
    This change adds a second per-cpu cr3 variable - xen_current_cr3 -
    which tracks the actual state of the vcpu cr3.  It is only updated once
    the actual hypercall to set cr3 has been completed.  Other processors
    wishing to unpin a pagetable can check other vcpu's xen_current_cr3
    values to see if any cross-cpu IPIs are needed to clean things up.
    
    Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  35. @jsgf @gregkh

    xen: add batch completion callbacks

    jsgf authored gregkh committed
    patch 91e0c5f in mainline.
    
    This adds a mechanism to register a callback function to be called once
    a batch of hypercalls has been issued.  This is typically used to unlock
    things which must remain locked until the hypercall has taken place.
    
    Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>