Error Logs in Table Store are losing data (and Occasional unhandled exception in ITableEntity.ReadEntity) #990

Closed
TimLovellSmith opened this Issue Apr 2, 2013 · 5 comments


@TimLovellSmith
NuGet member

System.Web.HttpUnhandledException (0x80004005): Exception of type 'System.Web.HttpUnhandledException' was thrown. ---> Microsoft.WindowsAzure.Storage.StorageException: The given key was not present in the dictionary. ---> System.Collections.Generic.KeyNotFoundException: The given key was not present in the dictionary.
at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
at NuGetGallery.Infrastructure.ErrorEntity.Microsoft.WindowsAzure.Storage.Table.ITableEntity.ReadEntity(IDictionary`2 properties, OperationContext operationContext) in c:\TeamCity\buildAgent\work\5d00fe9dafc32d61\Website\Infrastructure\TableErrorLog.cs:line 62
at Microsoft.WindowsAzure.Storage.Table.EntityUtilities.ResolveEntityByTypeTElement
at Microsoft.WindowsAzure.Storage.Table.Protocol.TableOperationHttpResponseParsers.<>c__DisplayClass81.<TableQueryPostProcessGeneric>b__6(String pk, String rk, DateTimeOffset ts, IDictionary`2 prop, String etag)
at Microsoft.WindowsAzure.Storage.Table.Protocol.TableOperationHttpResponseParsers.ReadAndResolve(ODataEntry entry, EntityResolver resolver)
at Microsoft.WindowsAzure.Storage.Table.Protocol.TableOperationHttpResponseParsers.TableQueryPostProcessGenericTElement
at Microsoft.WindowsAzure.Storage.Table.TableQuery`1.<>c__DisplayClassa2.b__9(RESTCommand`1 cmd, HttpWebResponse resp, Exception ex, OperationContext ctx)
at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ProcessEndOfRequest[T](ExecutionState`1 executionState, Exception ex)
at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ExecuteSyncT
--- End of inner exception stack trace ---
at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ExecuteSyncT
at Microsoft.WindowsAzure.Storage.Table.TableQuery`1.<>c__DisplayClass2.<Execute>b__1(IContinuationToken continuationToken)
at Microsoft.WindowsAzure.Storage.Core.Util.General.<LazyEnumerable>d__0`1.MoveNext()
at System.Linq.Buffer`1..ctor(IEnumerable`1 source)
at System.Linq.Enumerable.ToArrayTSource
at NuGetGallery.Infrastructure.AzureEntityList`1.<GetRange>d__e.MoveNext() in c:\TeamCity\buildAgent\work\5d00fe9dafc32d61\Website\Infrastructure\AzureEntityList.cs:line 196
at System.Linq.Buffer`1..ctor(IEnumerable`1 source)
at System.Linq.Enumerable.<ReverseIterator>d__a0`1.MoveNext()
at NuGetGallery.Infrastructure.TableErrorLog.GetErrors(Int32 pageIndex, Int32 pageSize, IList errorEntryList) in c:\TeamCity\buildAgent\work\5d00fe9dafc32d61\

@TimLovellSmith

Prioritization notes - I haven't seen evidence that it affects anything except the error log view, and even that normally works if you retry a few times.

@TimLovellSmith

Raising priority, as this seems to be happening with increasing regularity (maybe as the size of the error log grows?).

@TimLovellSmith

So I've been debugging this and it turns out the table entities which are coming back and causing problems have a 'Place_Held' property but no SerializedError property. Or paraphrasing, someone wished to create an error entity, but gave up half-way, creating instead a problem child.
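
To make the failure mode concrete, here is a minimal, language-neutral sketch in Python of tolerating such half-written entities when reading the log. It is not the gallery's actual code (that lives in TableErrorLog.cs and is C#). The plain dictionary standing in for the entity's property bag and the helper names `read_serialized_error` and `load_errors` are assumptions made for illustration.

```python
from typing import Iterable, Iterator, Mapping, Optional


def read_serialized_error(properties: Mapping[str, str]) -> Optional[str]:
    """Return the serialized error text, or None for a half-written entity.

    `properties` is a hypothetical stand-in for the property dictionary
    that the storage client passes to ReadEntity.
    """
    # Indexing properties["SerializedError"] directly would raise KeyError,
    # the Python analogue of the KeyNotFoundException in the stack trace.
    return properties.get("SerializedError")


def load_errors(entities: Iterable[Mapping[str, str]]) -> Iterator[str]:
    """Yield only the readable errors, skipping placeholder-only entities."""
    for properties in entities:
        serialized = read_serialized_error(properties)
        if serialized is None:
            continue  # placeholder row without SerializedError: skip it
        yield serialized
```

With this shape, one problem child no longer takes down the whole error log page. The readable entries are still listed.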

@TimLovellSmith

Of course, having written code to work around this scenario, I now find it difficult to reproduce...
The reason is that any time a new error occurs and is logged successfully, it overwrites the problem child. I will force a reproduction by doing some kill-process hackery on my gallery instance.

@TimLovellSmith

While investigating this I discovered something much more interesting (and shocking!)
Some errors don't get written to the table store at all because table store returns 400 bad request.
It turns out table storage has a completely annoying limitation: 'String values [in entity properties] may be up to 64 KB in size.'
This may mean we are losing a lot of interesting errors instead of logging them! :(
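
One possible way around the limit, sketched here in Python purely as an illustration, is to split the serialized error into pieces that each fit within 64 KB and store each piece in its own property. The sketch assumes string properties are measured in their UTF-16 encoding. The function name `split_for_table_store` is hypothetical and is not something the gallery or the storage client provides.

```python
from typing import List

MAX_PROPERTY_BYTES = 64 * 1024  # the 64 KB string limit quoted above


def split_for_table_store(text: str, max_bytes: int = MAX_PROPERTY_BYTES) -> List[str]:
    """Split text into pieces whose UTF-16 encoding fits in max_bytes each."""
    if max_bytes < 4:
        raise ValueError("max_bytes must hold at least one surrogate pair")
    data = text.encode("utf-16-le")  # 2 bytes per UTF-16 code unit
    step = max_bytes - (max_bytes % 2)  # stay aligned to whole code units
    chunks = []
    start = 0
    while start < len(data):
        end = min(start + step, len(data))
        if end < len(data):
            last_unit = int.from_bytes(data[end - 2:end], "little")
            if 0xD800 <= last_unit <= 0xDBFF:
                end -= 2  # do not cut a surrogate pair in half
        chunks.append(data[start:end].decode("utf-16-le"))
        start = end
    return chunks
```

Joining the pieces in order gives back the original text. An entity as a whole has its own size cap as well, so a very large error would still need truncating on top of this.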

@bhuvak closed this Apr 26, 2013