add shadow propagation through memory-to-memory moves that use xmm registers #1453

Closed
derekbruening opened this issue Nov 28, 2014 · 11 comments

Comments

@derekbruening
Contributor

From bruen...@google.com on February 25, 2014 10:35:03

Split from issue #243, which covers the general solution of shadow propagation through xmm registers and operations.

Original issue: http://code.google.com/p/drmemory/issues/detail?id=1453

@derekbruening
Contributor Author

From bruen...@google.com on February 25, 2014 07:39:57

Quoting from issue #243 :

"It turns out that most (if not all) of these are simply memory-to-memory moves that use the xmm registers. This is very similar to many floating-point cases ( issue #471 ) and may be amenable to a similar solution as a workaround until we implement full xmm shadowing, which I split as issue #1453 . Full shadowing will require mirroring all the crazy data movements within these registers performed by all the many sets of SSE_, AVX_, etc. instruction sets and will be non-trivial."

Here are some typical VS2013 uninits:

http://build.chromium.org/p/chromium.memory.fyi/builders/Windows%20Unit%20%28DrMemory%20full%29%20%282%29/builds/486/steps/memory%20test%3A%20media/logs/stdio

Error #1: UNINITIALIZED READ: reading 0x006ceff4-0x006ceff8 4 byte(s) within 0x006ceff0-0x006cf000
# 0 base.dll!std::deque<>::emplace_back<> [e:\b\depot_tools\win_toolchain\vs2013_files\vc\include\deque:1161]
# 1 base.dll!base::internal::IncomingTaskQueue::PostPendingTask [base\message_loop\incoming_task_queue.cc:138]
# 2 base.dll!base::internal::IncomingTaskQueue::AddToIncomingQueue [base\message_loop\incoming_task_queue.cc:28]
# 3 base.dll!base::internal::MessageLoopProxyImpl::PostDelayedTask [base\message_loop\message_loop_proxy_impl.cc:26]
# 4 base.dll!base::TaskRunner::PostTask [base\task_runner.cc:45]
# 5 media.dll!media::AudioManagerWin::AudioManagerWin [media\audio\win\audio_manager_win.cc:146]
# 6 media.dll!media::CreateAudioManager [media\audio\win\audio_manager_win.cc:513]
# 7 media.dll!media::AudioManager::Create [media\audio\audio_manager.cc:32]
# 8 media.dll!media::AudioManager::CreateForTesting [media\audio\audio_manager.cc:40]
# 9 testing::internal::HandleExceptionsInMethodIfSupported<> [testing\gtest\src\gtest.cc:2045]
Note: @0:00:06.475 in thread 4088
Note: instruction: movdqu (%ebx) -> %xmm0

http://build.chromium.org/p/chromium.memory.fyi/builders/Windows%20Unit%20%28DrMemory%20full%29%20%281%29/builds/486/steps/memory%20test%3A%20crypto/logs/stdio

Error #1: UNINITIALIZED READ: reading 0x0228a834-0x0228a838 4 byte(s) within 0x0228a834-0x0228a83c
# 0 CRNSS.dll!SEC_QuickDERDecodeItem_Util [third_party\nss\nss\lib\util\quickder.c:886]
# 1 CRNSS.dll!sftk_GetPubKey [third_party\nss\nss\lib\softoken\pkcs11.c:1759]
# 2 CRNSS.dll!sftk_handlePublicKeyObject [third_party\nss\nss\lib\softoken\pkcs11.c:967]
# 3 CRNSS.dll!sftk_handleKeyObject [third_party\nss\nss\lib\softoken\pkcs11.c:1352]
# 4 CRNSS.dll!sftk_handleObject [third_party\nss\nss\lib\softoken\pkcs11.c:1606]
# 5 CRNSS.dll!NSC_GenerateKeyPair [third_party\nss\nss\lib\softoken\pkcs11c.c:4914]
# 6 CRNSS.dll!PK11_GenerateKeyPairWithOpFlags [third_party\nss\nss\lib\pk11wrap\pk11akey.c:1388]
# 7 CRNSS.dll!PK11_GenerateKeyPairWithFlags [third_party\nss\nss\lib\pk11wrap\pk11akey.c:1471]
# 8 CRNSS.dll!PK11_GenerateKeyPair [third_party\nss\nss\lib\pk11wrap\pk11akey.c:1495]
# 9 crcrypto.dll!crypto::ECPrivateKey::CreateWithParams [crypto\ec_private_key_nss.cc:313]
#10 crcrypto.dll!crypto::ECPrivateKey::Create [crypto\ec_private_key_nss.cc:96]
#11 ECPrivateKeyUnitTest_InitRandomTest_Test::TestBody [crypto\ec_private_key_unittest.cc:23]
#12 testing::internal::HandleExceptionsInMethodIfSupported<> [testing\gtest\src\gtest.cc:2045]
Note: @0:00:03.557 in thread 1892
Note: instruction: movq (%eax) -> %xmm0

*** TODO breakdown of most frequent errors

I grabbed some stdio results from the bots:

l
total 272348
2592 base-failures-stdio.txt 37552 media-failures-stdio.txt 14844 remoting-failures-stdio.txt
1072 crypto-failures-stdio.txt 134272 net1-failures-stdio.txt 82016 unit1-failures-stdio.txt

grep -h 'Note: instruction:' * | sed 's/^.*instruction: //;s/<\/span>//' | wc

67972 271913 2003443

grep -h 'Note: instruction:' * | sed 's/^.*instruction: //;s/<\/span>//' | sort | uniq -c | sort -n

...
640 movdqu (%ecx) -> %xmm0
670 movq 0x20(%eax) -> %xmm0
688 movq (%ecx) -> %xmm0
885 movdqu 0xffffffc4(%ebp) -> %xmm0
930 movdqu 0x10(%eax) -> %xmm0
946 movdqu (%eax) -> %xmm0
995 movq 0x10(%eax) -> %xmm0
1031 movq 0xffffffd8(%ebp) -> %xmm0
1696 movdqu 0x10(%ecx) -> %xmm0
2148 movdqu (%edi) -> %xmm0
2587 movq (%eax) -> %xmm0
4095 movq 0xffffffcc(%ebp) -> %xmm0
5657 movdqu 0x0000008c(%edi) -> %xmm0
39837 movdqu (%ebx) -> %xmm0
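
For anyone who wants to repeat this tally without the shell pipeline, here is a minimal Python sketch of the same counting step. It is not part of Dr. Memory. The function name `tally_instructions` and the sample lines are made up for illustration, and it assumes the bot logs are HTML-escaped text with a trailing `</span>` on the instruction lines, as the sed expression suggests.

```python
import html
import re
from collections import Counter

def tally_instructions(lines):
    """Count each distinct instruction reported on 'Note: instruction:' lines."""
    counts = Counter()
    for line in lines:
        m = re.search(r"Note: instruction: (.*)", line)
        if m:
            # Drop the HTML residue and decode entities such as &gt;.
            text = m.group(1).replace("</span>", "")
            counts[html.unescape(text).strip()] += 1
    return counts

# Hypothetical sample lines in the shape the logs appear to have.
sample = [
    "Note: instruction: movdqu (%ebx) -&gt; %xmm0</span>",
    "Error #1: UNINITIALIZED READ: reading ...",
    "Note: instruction: movq (%eax) -&gt; %xmm0</span>",
    "Note: instruction: movdqu (%ebx) -&gt; %xmm0</span>",
]

# Print in ascending order of frequency, like "sort | uniq -c | sort -n".
for instr, n in sorted(tally_instructions(sample).items(), key=lambda kv: kv[1]):
    print(n, instr)
```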

grep -h 'Note: instruction:' * | sed 's/^.*instruction: //;s/<\/span>//' | sort | uniq -c | grep -v '; %xmm'

 20 cmovnz %edx %eax -> %eax
  5 cmp    %ecx $0x00000039
  4 cmp    %ecx (%eax)

grep -A 1 'Error #.*UNINITIALIZED' *.txt | grep '# 0' | awk '{print $4}' | sort | uniq -c | sort -n

210 CRNSS.dll!DecodeGroup
215 net.dll!net::QuicUnackedPacketMap::AddPacket
223 net.dll!net::CookieMonster::GetCookiesWithOptionsAsync
279 hunspell::NodeReader::ReaderForListAt
294 content.dll!base::Bind<>
296 net.dll!std::_List_buy<>::_Buynode<>
409 media.dll!std::vector<>::push_back
620 std::deque<>::push_back
749 base.dll!base::`anonymous
939 net::StaticSocketDataProvider::GetNextRead
945 aura.dll!aura::Window::SetPropertyInternal

1022 CRNSS.dll!DecodeSequence
1321 net.dll!net::QuicConfig::QuicConfig
3089 net.dll!net::HttpResponseInfo::operator=
4106 CRNSS.dll!DecodeItem
22299 base.dll!std::deque<>::emplace_back<>

*** TODO get surrounding code for xmm ones

cd /e/derek/chromium/src/outVS2013/out/Release(master)

~/drmemory/git/build_x86_dbg/bin/drmemory.exe -pause_at_error -pause_at_assert $DRMEM_CHROME_ARGS -dr d:/derek/dr/git/exports -batch -- ./media_unittests.exe --ui-test-action-timeout=12000000 --ui-test-action-max-timeout=28000000 --ui-test-terminate-timeout=12000000 --gtest_filter=AudioInputControllerTest.CreateAndClose

Nearly all are memory-to-memory copies that just use xmm0. They come in both 64-bit and 128-bit sizes, so they will be in the slowpath. For example:

media:
base!std::deque<base::PendingTask,std::allocator<base::PendingTask> >::emplace_back<base::PendingTask const &>+0x97 [c:\program files (x86)\microsoft visual studio 12.0\vc\include\deque @ 1161]:
6e692437 f30f6f03 movdqu xmm0,xmmword ptr [ebx]
6e69243b f30f7f07 movdqu xmmword ptr [edi],xmm0
6e69243f f30f7e4310 movq xmm0,mmword ptr [ebx+10h]
6e692444 660fd64710 movq mmword ptr [edi+10h],xmm0
6e692449 8b4b18 mov ecx,dword ptr [ebx+18h]
6e69244c c645fc01 mov byte ptr [ebp-4],1
6e692450 894f18 mov dword ptr [edi+18h],ecx
6e692453 85c9 test ecx,ecx

crypto ECPrivateKeyUnitTest.InitRandomTest:
CRNSS!SEC_QuickDERDecodeItem_Util+0x1b [e:\derek\chromium\src\third_party\nss\nss\lib\util\quickder.c @ 886]:
6e5a049b f30f7e00 movq xmm0,mmword ptr [eax]
6e5a049f 8b4008 mov eax,dword ptr [eax+8]
6e5a04a2 6a01 push 1
6e5a04a4 51 push ecx
6e5a04a5 8945fc mov dword ptr [ebp-4],eax
6e5a04a8 8d45f4 lea eax,[ebp-0Ch]
6e5a04ab 50 push eax
6e5a04ac 52 push edx
6e5a04ad ff750c push dword ptr [ebp+0Ch]
6e5a04b0 660fd645f4 movq mmword ptr [ebp-0Ch],xmm0
6e5a04b5 e836f9ffff call CRNSS!DecodeItem (6e59fdf0)

So while the deque one may be amenable to a heuristic, the crypto one has
too much in between the load and the store and needs true shadowing and
propagation...unless we're very aggressive.
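
To make the "heuristic vs. how aggressive" trade-off concrete, here is a rough Python sketch of a look-ahead check: after an xmm load, scan a bounded window for a store of the same register. This is only an illustration, not the check that Dr. Memory implements. `Instr`, `find_paired_store`, and the `window` parameter are hypothetical names, and the sketch ignores aliasing and control flow.

```python
from typing import NamedTuple, Optional, Sequence

class Instr(NamedTuple):
    opcode: str
    dst: str  # destination operand text ("" if none)
    src: str  # source operand text ("" if none)

MOVES = {"movq", "movdqu"}

def find_paired_store(instrs: Sequence[Instr], load_idx: int, window: int) -> Optional[int]:
    """Return the index of a store of the loaded xmm register within `window`
    instructions after the load, or None if there is no such store."""
    load = instrs[load_idx]
    if load.opcode not in MOVES or not load.dst.startswith("xmm") or "[" not in load.src:
        return None
    for i in range(load_idx + 1, min(load_idx + 1 + window, len(instrs))):
        ins = instrs[i]
        if ins.opcode in MOVES and ins.src == load.dst and "[" in ins.dst:
            return i
        if ins.dst == load.dst:
            return None  # register overwritten before being stored
    return None

# The deque sequence: the store immediately follows the load.
deque_seq = [
    Instr("movdqu", "xmm0", "xmmword ptr [ebx]"),
    Instr("movdqu", "xmmword ptr [edi]", "xmm0"),
]

# The SEC_QuickDERDecodeItem_Util sequence: 8 instructions sit in between.
crypto_seq = [
    Instr("movq", "xmm0", "mmword ptr [eax]"),
    Instr("mov", "eax", "dword ptr [eax+8]"),
    Instr("push", "", "1"),
    Instr("push", "", "ecx"),
    Instr("mov", "dword ptr [ebp-4]", "eax"),
    Instr("lea", "eax", "[ebp-0Ch]"),
    Instr("push", "", "eax"),
    Instr("push", "", "edx"),
    Instr("push", "", "dword ptr [ebp+0Ch]"),
    Instr("movq", "mmword ptr [ebp-0Ch]", "xmm0"),
]

print(find_paired_store(deque_seq, 0, window=1))
print(find_paired_store(crypto_seq, 0, window=1))
print(find_paired_store(crypto_seq, 0, window=9))
```

A window of 1 catches the deque case but misses the crypto one. Only a much larger window finds the crypto store, and a larger window raises the risk of pairing a load with an unrelated store.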

crypto ECPrivateKeyUnitTest.InitRandomTest also:
CRNSS!DecodeItem+0x8a [e:\derek\chromium\src\third_party\nss\nss\lib\util\quickder.c @ 666]:
6d01fe7a f30f7e45cc movq xmm0,mmword ptr [ebp-34h]
6d01fe7f 8b45d4 mov eax,dword ptr [ebp-2Ch]
6d01fe82 8945ec mov dword ptr [ebp-14h],eax
6d01fe85 660fd645e4 movq mmword ptr [ebp-1Ch],xmm0
6d01fe8a 6a01 push 1
6d01fe8c 8d45d8 lea eax,[ebp-28h]
6d01fe8f 50 push eax
6d01fe90 51 push ecx
6d01fe91 e8ca030000 call CRNSS!GetItem (6d020260)

net HttpBasicStateTest.ConstructsProperly:
net!net::HttpResponseInfo::operator=+0x149 [e:\derek\chro...

@derekbruening
Contributor Author

From bruen...@google.com on February 25, 2014 07:39:57

...mium\src\net\http\http_response_info.cc @ 150]:
66b61a39 f30f6f878c000000 movdqu xmm0,xmmword ptr [edi+8Ch]
66b61a41 f30f7f868c000000 movdqu xmmword ptr [esi+8Ch],xmm0
66b61a49 8a879c000000 mov al,byte ptr [edi+9Ch]
66b61a4f 88869c000000 mov byte ptr [esi+9Ch],al
66b61a55 8bbfa0000000 mov edi,dword ptr [edi+0A0h]
66b61a5b 85ff test edi,edi

net QuicCryptoClientStreamTest.ConnectedAfterSHLO
net!net::QuicConfig::QuicConfig+0x30:
66b21f80 f30f6f4620 movdqu xmm0,xmmword ptr [esi+20h]
66b21f85 f30f7f4720 movdqu xmmword ptr [edi+20h],xmm0
66b21f8a f30f7e4630 movq xmm0,mmword ptr [esi+30h]
66b21f8f 660fd64730 movq mmword ptr [edi+30h],xmm0
66b21f94 f30f6f4638 movdqu xmm0,xmmword ptr [esi+38h]
66b21f99 f30f7f4738 movdqu xmmword ptr [edi+38h],xmm0
66b21f9e f30f7e4648 movq xmm0,mmword ptr [esi+48h]
66b21fa3 660fd64748 movq mmword ptr [edi+48h],xmm0
66b21fa8 f30f6f4650 movdqu xmm0,xmmword ptr [esi+50h]
66b21fad f30f7f4750 movdqu xmmword ptr [edi+50h],xmm0
66b21fb2 f30f7e4660 movq xmm0,mmword ptr [esi+60h]
66b21fb7 660fd64760 movq mmword ptr [edi+60h],xmm0
66b21fbc 8b4668 mov eax,dword ptr [esi+68h]
66b21fbf 894768 mov dword ptr [edi+68h],eax

crypto ECPrivateKeyUnitTest.InitRandomTest also:
CRNSS!DecodeSequence+0xf [e:\derek\chromium\src\third_party\nss\nss\lib\util\quickder.c @ 359]:
6e1301bf f30f7e00 movq xmm0,mmword ptr [eax]
6e1301c3 83c710 add edi,10h
6e1301c6 8b4008 mov eax,dword ptr [eax+8]
6e1301c9 8945fc mov dword ptr [ebp-4],eax
6e1301cc 8d45e8 lea eax,[ebp-18h]
6e1301cf 50 push eax
6e1301d0 8d45f4 lea eax,[ebp-0Ch]
6e1301d3 660fd645f4 movq mmword ptr [ebp-0Ch],xmm0
6e1301d8 50 push eax
6e1301d9 e882000000 call CRNSS!GetItem (6e130260)

unit CaptivePortalTabHelperTest.HttpTimeoutLinkDoctor
aura!aura::Window::SetPropertyInternal+0x9b [e:\derek\chromium\src\ui\aura\window.cc @ 861]:
6651f39b f30f6f45c4 movdqu xmm0,xmmword ptr [ebp-3Ch]
6651f3a0 f30f7f00 movdqu xmmword ptr [eax],xmm0
6651f3a4 f30f7e45d4 movq xmm0,mmword ptr [ebp-2Ch]
6651f3a9 660fd64010 movq mmword ptr [eax+10h],xmm0
6651f3ae 8b87e0000000 mov eax,dword ptr [edi+0E0h]
6651f3b4 2b87dc000000 sub eax,dword ptr [edi+0DCh]
6651f3ba a9fcffffff test eax,0FFFFFFFCh

unit CaptivePortalTabHelperTest.HttpTimeoutLinkDoctor also
unit_tests!aura::TestScreen::GetDisplayMatching+0x2b [e:\derek\chromium\src\ui\aura\test\test_screen.cc @ 143]:
02eb6c1b f30f7e4140 movq xmm0,mmword ptr [ecx+40h]
02eb6c20 660fd64030 movq mmword ptr [eax+30h],xmm0
02eb6c25 8be5 mov esp,ebp
02eb6c27 5d pop ebp
02eb6c28 c20800 ret 8

unit CaptivePortalTabHelperTest.HttpTimeoutLinkDoctor also
cc!cc::LayerTreeHost::LayerTreeHost+0x16c [e:\derek\chromium\src\cc\trees\layer_tree_host.cc @ 125]:
65659b5c f30f6f808c000000 movdqu xmm0,xmmword ptr [eax+8Ch]
65659b64 f30f7f8348010000 movdqu xmmword ptr [ebx+148h],xmm0
65659b6c f30f7e809c000000 movq xmm0,mmword ptr [eax+9Ch]
65659b74 660fd68358010000 movq mmword ptr [ebx+158h],xmm0
65659b7c c7836001000000000000 mov dword ptr [ebx+160h],0
65659b86 c7836401000000000000 mov dword ptr [ebx+164h],0

@derekbruening
Contributor Author

From bruen...@google.com on February 25, 2014 09:27:27

Labels: -Priority-Medium Priority-High Hotlist-Chrome

@derekbruening
Contributor Author

From bruen...@google.com on February 26, 2014 08:10:54

After implementing a simple check modeled on the issue #471 check, most of the errors go away.

*** DONE gather stats: big slowpath perf hit? nontrivial but livable
CLOSED: [2014-02-25 Tue 17:21]
- State "DONE" from "TODO" [2014-02-25 Tue 17:21]

conclusions: while there are a bunch of movdqu in the slowpath, they're
less than half of the total, and there are more andor def exceptions than
these xmm exceptions.

crypto ECPrivateKeyUnitTest.InitRandomTest:
def exceptions: andor: 5057, rawmemchr: 0, strrchr: 0
more def exceptions: fldfst: 82
Per-opcode slow path executions:
8 and: 42
14 cmp: 7
18 push: 131
21 pusha: 2
22 popa: 2
48 jmp: 1711
55 mov: 6091
56 mov: 113
60 test: 2
62 xchg: 8
141 movq: 164
142 movdqu: 4261
184 cpuid: 18
205 bswap: 2798
293 movups: 273
298 movlpd: 97
389 stos: 19
631 xgetbv: 2
Per-size slow path executions:
1-byte: 0
2-byte: 95
4-byte: 10847
8-byte: 261
Other: 4538

unit CaptivePortalTabHelperTest.HttpTimeoutLinkDoctor
def exceptions: andor: 35588, rawmemchr: 0, strrchr: 0
more def exceptions: fldfst: 535
Per-opcode slow path executions:
5 or: 126
8 and: 2944
10 sub: 1
14 cmp: 9137
18 push: 14375
20 pop: 7
48 jmp: 10759
55 mov: 42753
56 mov: 8831
60 test: 77
62 xchg: 29
141 movq: 84
142 movdqu: 32058
143 movdqa: 10
184 cpuid: 50
195 movzx: 2868
200 movsx: 16
205 bswap: 59663
293 movups: 3470
295 movupd: 2
296 movsd: 961
298 movlpd: 194
387 movs: 16
389 stos: 284
390 rep stos: 116
394 rep cmps: 750
631 xgetbv: 5
Per-size slow path executions:
1-byte: 1433
2-byte: 4550
4-byte: 146824
8-byte: 1239
Other: 35540
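
As a sanity check on the "less than half of the total" conclusion, this small Python sketch computes the movdqu share of slowpath executions from the numbers listed above. The dictionary layout and the name `movdqu_share` are just for illustration. The totals are the sums of the per-size rows, which match the sums of the per-opcode rows.

```python
# Per-size slowpath counts (1-, 2-, 4-, 8-byte, Other) and movdqu counts,
# copied from the two stat dumps above.
stats = {
    "crypto ECPrivateKeyUnitTest.InitRandomTest": {
        "sizes": [0, 95, 10847, 261, 4538],
        "movdqu": 4261,
    },
    "unit CaptivePortalTabHelperTest.HttpTimeoutLinkDoctor": {
        "sizes": [1433, 4550, 146824, 1239, 35540],
        "movdqu": 32058,
    },
}

def movdqu_share(entry):
    """Fraction of all slowpath executions that are movdqu."""
    return entry["movdqu"] / sum(entry["sizes"])

for name, entry in stats.items():
    print(f"{name}: {movdqu_share(entry):.1%} of {sum(entry['sizes'])} slowpath executions")
```

That works out to roughly 27% for the crypto test and 17% for the unit test, consistent with the conclusion.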

*** TODO still seeing uninits due to more complex sequences

% grep -A 1 'Error #.*UNINIT' /e/derek/chromium/src/OUT-i1453 | grep -vE 'UNINIT|^--' | awk '{print $4}' | sort | uniq -c
9 icuuc.dll!icu_46::UTS46::processLabel
14 net.dll!base::internal::Invoker<>::Run
72 net.dll!base::internal::ReturnAsParamAdapter<>
2 testing::internal::CmpHelperEQ<>

icuuc and CmpHelperEQ do not involve xmm0, so we really have just 2 here:

[ RUN ] FileStreamTest.AsyncOpenExplicitClose
7036
7036 Error #1: UNINITIALIZED READ: reading 0x06b4f6e4-0x06b4f6e8 4 byte(s) within 0x06b4f6e0-0x06b4f6f0
7036 # 0 net.dll!base::internal::ReturnAsParamAdapter<> [base\task_runner_util.h:23]
7036 # 1 net.dll!base::internal::Invoker<>::Run [base\bind_internal.h:1253]
7036 # 2 base.dll!base::`anonymous namespace'::PostTaskAndReplyRelay::Run [base\threading\post_task_and_reply_impl.cc:42]
7036 # 3 base.dll!base::internal::Invoker<>::Run [base\bind_internal.h:1169]
7036 # 4 base.dll!base::`anonymous namespace'::WorkItemCallback [base\threading\worker_pool_win.cc:32]
7036 # 5 ntdll.dll!RtlpTpWorkCallback
7036 # 6 ntdll.dll!TppWorkerThread
7036 # 7 KERNEL32.dll!BaseThreadInitThunk
7036 Note: @0:00:16.363 in thread 7036
7036 Note: instruction: movdqu (%eax) -> %xmm0

2 problems here: we need to remember 2 loads at once (xmm0 and xmm1), and
the store address depends on a register (eax) written in between the load
and the store.

net!base::internal::ReturnAsParamAdapter<net::FileStream::Context::OpenResult>+0x17 [e:\derek\chromium\src\base\task_runner_util.h @ 23]:
6044f0e7 f30f6f00 movdqu xmm0,xmmword ptr [eax]
6044f0eb f30f7e4810 movq xmm1,mmword ptr [eax+10h]
6044f0f0 8b450c mov eax,dword ptr [ebp+0Ch]
6044f0f3 f30f7f00 movdqu xmmword ptr [eax],xmm0
6044f0f7 660fd64810 movq mmword ptr [eax+10h],xmm1
6044f0fc 8be5 mov esp,ebp
6044f0fe 5d pop ebp
6044f0ff c3 ret

This one would need more complex determination of the store address:

net!base::internal::Invoker<1,base::internal::BindState<base::internal::RunnableAdapter<net::FileStream::Context::IOResult (__thiscall net::FileStream::Context::*)(void)>,net::FileStream::Context::IOResult __cdecl(net::FileStream::Context *),void __cdecl(base::internal::UnretainedWrapper<net::FileStream::Context>)>,net::FileStream::Context::IOResult __cdecl(net::FileStream::Context *)>::Run+0x36 [e:\derek\chromium\src\base\bind_internal.h @ 1169]:
60450156 f30f6f00 movdqu xmm0,xmmword ptr [eax]
6045015a 8b4508 mov eax,dword ptr [ebp+8]
6045015d f30f7f00 movdqu xmmword ptr [eax],xmm0
60450161 8be5 mov esp,ebp
60450163 5d pop ebp
60450164 c3 ret

@derekbruening
Contributor Author

From derek.br...@gmail.com on February 26, 2014 18:18:31

This issue was closed by revision r1746 .

Status: Fixed

@derekbruening
Contributor Author

From bruen...@google.com on February 27, 2014 13:56:29

Uh-oh, here are some more complex ones. They are passing a 16-byte struct by value to a callee:

Error #2: UNINITIALIZED READ: reading 0x0041f068-0x0041f06c 4 byte(s) within 0x0041f05c-0x0041f06c

# 0 extensions::WebNavigationTabObserver::DidStartProvisionalLoadForFrame [e:\derek\chromium\src\chrome\browser\extensions\api\web_navigation\web_navigation_api.cc:361] (0x0338aa93 <unit_tests.exe+0x249aa93>) modid:1
# 1 content.dll!content::WebContentsImpl::DidStartProvisionalLoad [e:\derek\chromium\src\content\browser\web_contents\web_contents_impl.cc:2024] (0x633de435 <content.dll+0x2de435>) modid:3

0338aa93 f30f6f856cffffff movdqu xmm0,xmmword ptr [ebp-94h]
0338aa9b 83ec10 sub esp,10h
0338aa9e 8d4f14 lea ecx,[edi+14h]
0338aaa1 8bc4 mov eax,esp
0338aaa3 f30f7f00 movdqu xmmword ptr [eax],xmm0
0338aaa7 e884c8ffff call unit_tests!extensions::FrameNavigationState::CanSendEvents (03387330)

FrameNavigationState::FrameID frame_id(frame_num, render_view_host);
if (!navigation_state_.CanSendEvents(frame_id))
  return;

0:010> dt -v unit_tests!extensions::FrameNavigationState::FrameID
struct extensions::FrameNavigationState::FrameID, 7 elements, 0x10 bytes

Several more like that:

Error #1: UNINITIALIZED READ: reading 0x0041f058-0x0041f05c 4 byte(s) within 0x0041f04c-0x0041f05c

# 0 extensions::WebNavigationTabObserver::DidStartProvisionalLoadForFrame [e:\derek\chromium\src\chrome\browser\extensions\api\web_navigation\web_navigation_api.cc:359] (0x0338aa60 <unit_tests.exe+0x249aa60>) modid:1
# 1 content.dll!content::WebContentsImpl::DidStartProvisionalLoad [e:\derek\chromium\src\content\browser\web_contents\web_contents_impl.cc:2024] (0x633de435 <content.dll+0x2de435>) modid:3

unit_tests!extensions::WebNavigationTabObserver::DidStartProvisionalLoadForFrame+0x90 [e:\derek\chromium\src\chrome\browser\extensions\api\web_navigation\web_navigation_api.cc @ 359]:
0338aa60 f30f6f855cffffff movdqu xmm0,xmmword ptr [ebp-0A4h]
0338aa68 8d4f14 lea ecx,[edi+14h]
0338aa6b ff7520 push dword ptr [ebp+20h]
0338aa6e ff7518 push dword ptr [ebp+18h]
0338aa71 ff7580 push dword ptr [ebp-80h]
0338aa74 83ec10 sub esp,10h
0338aa77 8bc4 mov eax,esp
0338aa79 83ec10 sub esp,10h
0338aa7c f30f7f00 movdqu xmmword ptr [eax],xmm0
0338aa80 8bc4 mov eax,esp
0338aa82 f30f6f856cffffff movdqu xmm0,xmmword ptr [ebp-94h]
0338aa8a f30f7f00 movdqu xmmword ptr [eax],xmm0
0338aa8e e8add5ffff call unit_tests!extensions::FrameNavigationState::TrackFrame (03388040)

Error #3: UNINITIALIZED READ: reading 0x0041e5ec-0x0041e5f0 4 byte(s) within 0x0041e5e0-0x0041e5f0

# 0 extensions::WebNavigationTabObserver::DidCommitProvisionalLoadForFrame [e:\derek\chromium\src\chrome\browser\extensions\api\web_navigation\web_navigation_api.cc:391] (0x03389e75 <unit_tests.exe+0x2499e75>) modid:1
# 1 content.dll!content::WebContentsImpl::DidCommitProvisionalLoad [e:\derek\chromium\src\content\browser\web_contents\web_contents_impl.cc:2117] (0x633dd1a4 <content.dll+0x2dd1a4>) modid:3

unit_tests!extensions::WebNavigationTabObserver::DidCommitProvisionalLoadForFrame+0x75 [e:\derek\chromium\src\chrome\browser\extensions\api\web_navigation\web_navigation_api.cc @ 391]:
03389e75 f30f6f85dcfeffff movdqu xmm0,xmmword ptr [ebp-124h]
03389e7d 53 push ebx
03389e7e 83ec10 sub esp,10h
03389e81 8d4efc lea ecx,[esi-4]
03389e84 8bc4 mov eax,esp
03389e86 f30f7f00 movdqu xmmword ptr [eax],xmm0
03389e8a e8f1110000 call unit_tests!extensions::WebNavigationTabObserver::IsReferenceFragmentNavigation (0338b080)

etc.

@derekbruening
Contributor Author

From bruen...@google.com on February 27, 2014 15:36:34

Re-opening. One observation is that 16-byte memrefs currently always go to slowpath: so I can solve this w/o needing the bitlevel hack, and thus w/o needing to know the store's address.

Status: Started

@derekbruening
Contributor Author

From bruen...@google.com on February 27, 2014 18:02:02

Strike that: aligned 16-byte memrefs stay on fastpath.

Here are 2 possible solutions:

**** TODO re-order load to be next to store in app2app

How about re-ordering the load to be next to the store? There's no
intervening write to the load memory. The load is only hoisted that far
above the store for performance. So we could add an app2app pass.
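
Here is a rough Python sketch of the legality check such a re-ordering pass would need before sinking the load down to the store. It is an illustration only, not Dr. Memory or DynamoRIO code. `Instr`, `can_sink_load`, and the per-instruction read/write sets are hypothetical, and every memory write is conservatively treated as a possible alias of the loaded memory.

```python
from typing import FrozenSet, NamedTuple, Sequence

class Instr(NamedTuple):
    text: str
    reg_writes: FrozenSet[str] = frozenset()
    reg_reads: FrozenSet[str] = frozenset()
    writes_mem: bool = False

def can_sink_load(instrs: Sequence[Instr], load_idx: int, store_idx: int,
                  xmm: str, addr_regs: FrozenSet[str]) -> bool:
    """True if the xmm load at load_idx could be moved to just before the store."""
    for ins in instrs[load_idx + 1:store_idx]:
        if ins.reg_writes & addr_regs:
            return False  # the load's address would change
        if ins.writes_mem:
            return False  # conservatively assume it may alias the loaded memory
        if xmm in ins.reg_reads or xmm in ins.reg_writes:
            return False  # the xmm value is used or clobbered in between
    return True

# Argument-passing sequence before the CanSendEvents call.
pass_by_value = [
    Instr("movdqu xmm0,xmmword ptr [ebp-94h]", frozenset({"xmm0"}), frozenset({"ebp"})),
    Instr("sub esp,10h", frozenset({"esp"}), frozenset({"esp"})),
    Instr("lea ecx,[edi+14h]", frozenset({"ecx"}), frozenset({"edi"})),
    Instr("mov eax,esp", frozenset({"eax"}), frozenset({"esp"})),
    Instr("movdqu xmmword ptr [eax],xmm0", frozenset(), frozenset({"eax", "xmm0"}), True),
]

# ReturnAsParamAdapter sequence: eax is reloaded between load and store.
eax_rewritten = [
    Instr("movdqu xmm0,xmmword ptr [eax]", frozenset({"xmm0"}), frozenset({"eax"})),
    Instr("movq xmm1,mmword ptr [eax+10h]", frozenset({"xmm1"}), frozenset({"eax"})),
    Instr("mov eax,dword ptr [ebp+0Ch]", frozenset({"eax"}), frozenset({"ebp"})),
    Instr("movdqu xmmword ptr [eax],xmm0", frozenset(), frozenset({"eax", "xmm0"}), True),
]

print(can_sink_load(pass_by_value, 0, 4, "xmm0", frozenset({"ebp"})))
print(can_sink_load(eax_rewritten, 0, 3, "xmm0", frozenset({"eax"})))
```

The second case shows the limitation: when the load's address register is overwritten in between, the pass would have to preserve the old address somewhere. Sequences with a push between the load and the store are also rejected under the conservative memory-write rule.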

**** TODO add static shadowing of xmm

We could add shadowing of xmm* but no handling of intra-xmm operations.

An added benefit is that this is a step toward full propagation.

We would need to figure out where to put the shadow memory, and if we have
indirection, we'd have to work that into the fastpath.

@derekbruening
Contributor Author

From bruen...@google.com on March 02, 2014 10:18:25

I went with the static shadowing as follows:

issue #1453, issue #243: add shadowing of xmm registers and propagation to and from
memory, but no intra-xmm operation propagation

All xmm registers are shadowed, and we propagate to and from them on loads
and stores. For now we only shadow on certain operations we know we can
handle.

The shadowing is indirected, with a pointer to the memory in one new field
in TLS. Raises the spill slots to 6 for this pointer. Adds support for
the indirection to the fastpath, using a 3rd scratch register for
simplicity, and re-loading every time. This could be optimized in the
future.

Puts in place basic OR-combining for inter-xmm copies, but we need more
infrastructure to start mirroring xmm opcode data movements: that's for the
rest of issue #243 .

Generalizes the shadow register layer to handle 16-byte registers (and thus
4-byte shadow values).

Generalizes the slowpath to handle 16-byte registers.

Starts generalizing the fastpath to handle src size != dst size for
super-dword operations, but that is not yet complete.

Does not handle ymm registers yet.

Cleans up the incorrect use of mi->need_nonoffs_reg3 for cases that
actually need memref offsets by adding a separate field
mi->need_offs_early, allowing the xmm usage to request a 3rd scratch reg
without triggering unrelated mem xl8 interactions that assume it needs a
memory offset (we only support aligned in fastpath for now).
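
A toy Python model of the load/store propagation described in this commit message may help show why it silences the false positives. It is not the actual implementation. `XmmShadow` and its methods are invented for illustration, the 2-bits-per-byte layout is inferred from the 16-byte register to 4-byte shadow ratio above, the specific code values are an assumption, and untracked memory is treated as defined purely to keep the sketch short.

```python
DEFINED, UNDEFINED = 0b00, 0b11  # assumed 2-bit shadow codes per application byte

class XmmShadow:
    def __init__(self, num_regs=8):
        self.mem = {}  # address -> shadow code; missing means DEFINED in this sketch
        self.regs = [[DEFINED] * 16 for _ in range(num_regs)]

    def load(self, reg, addr, size):
        """movq/movdqu mem -> xmm: copy the shadow instead of reporting an error.
        movq zeroes the upper 8 bytes, so those bytes become defined."""
        codes = [self.mem.get(addr + i, DEFINED) for i in range(size)]
        self.regs[reg] = codes + [DEFINED] * (16 - size)

    def store(self, reg, addr, size):
        """movq/movdqu xmm -> mem: copy the register's shadow to memory."""
        for i in range(size):
            self.mem[addr + i] = self.regs[reg][i]

    def packed(self, reg):
        """Pack the 16 two-bit codes into one 32-bit (4-byte) shadow value."""
        value = 0
        for i, code in enumerate(self.regs[reg]):
            value |= code << (2 * i)
        return value

shadow = XmmShadow()
# A 16-byte struct at 0x1000 whose bytes 4..7 were never written.
for a in range(0x1004, 0x1008):
    shadow.mem[a] = UNDEFINED

shadow.load(0, 0x1000, 16)   # movdqu xmm0, [src]
shadow.store(0, 0x2000, 16)  # movdqu [dst], xmm0
print(hex(shadow.packed(0)))
print([shadow.mem[0x2000 + i] for i in range(16)])
```

The uninitialized bytes travel with the copy and stay marked undefined at the destination, so an error would only be raised if something later makes a meaningful read of those bytes.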

@derekbruening
Contributor Author

From bruen...@google.com on March 02, 2014 10:18:43

VS2013 uses sequences like this to zero data structures:

xorps   xmm0,xmm0
movdqu  xmmword ptr [ebp-144h],xmm0
movdqu  xmmword ptr [ebp-134h],xmm0

so we need to handle xor as well
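
One plausible way to treat this idiom in the shadow state, sketched in Python: xor of a register with itself always produces zero, so the result can be marked fully defined no matter what the register's shadow was before. This is a guess at the shape of the handling, not the code that went in. `xor_shadow` and the shadow encoding are hypothetical, and the fallback for two different registers simply ORs the per-byte codes.

```python
DEFINED, UNDEFINED = 0b00, 0b11  # assumed 2-bit shadow codes per application byte

def xor_shadow(dst_reg, src_reg, shadows):
    """Return the new 16-entry shadow for dst after 'xorps dst, src'."""
    if dst_reg == src_reg:
        # x ^ x == 0 for any x, so every result byte is defined.
        return [DEFINED] * 16
    # Otherwise a result byte is undefined if either input byte is undefined.
    return [a | b for a, b in zip(shadows[dst_reg], shadows[src_reg])]

shadows = {
    "xmm0": [UNDEFINED] * 16,  # never written by the app
    "xmm1": [DEFINED] * 16,
}

shadows["xmm0"] = xor_shadow("xmm0", "xmm0", shadows)
print(shadows["xmm0"])
```

With that in place, the two movdqu stores that follow the xorps would write fully defined shadow to the zeroed structure.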

@derekbruening
Contributor Author

From derek.br...@gmail.com on March 02, 2014 17:53:42

This issue was closed by revision r1755 .

Status: Fixed
