Critial Failure - Trying to compile Apache Tika using .NET Core 3.1 Tools #85

Closed
dylanlangston opened this issue Jun 27, 2022 · 4 comments · Fixed by #86

@dylanlangston

Hello, I am trying to use IKVM to compile a .NET version of Apache Tika (https://tika.apache.org/). Essentially the same goal as Tika on .NET but targeting .NET 6.0.

I am able to use the .NET Framework tools IKVM-8.2.0-prerelease.911-tools-net461-any without issue. When attempting the same procedure using the .NET Core 3.1 tools IKVM-8.2.0-prerelease.911-tools-netcoreapp3.1-win7-x64 I'm running into a bit of a snag.


To test compatibility I ran the following command against the .NET Framework tools:

C:\Users\dlangston\Downloads\IKVM-8.2.0-prerelease.911-tools-net461-any\ikvm.exe -jar C:\Users\dlangston\Downloads\Tika\tika-app-2.4.1.jar C:\Users\dlangston\Desktop\test.svg
This produced the expected output with no errors.

I repeated this using the .NET Core 3.1 tools but encountered an error related to loading WinForms (source of issue seen here):
C:\Users\dlangston\Downloads\IKVM-8.2.0-prerelease.911-tools-net461-any\ikvm.exe -jar C:\Users\dlangston\Downloads\Tika\tika-app-2.4.1.jar C:\Users\dlangston\Desktop\test.svg

System.IO.FileNotFoundException: Could not load file or assembly 'System.Windows.Forms, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'. The system cannot find the file specified.
File name: 'System.Windows.Forms, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'
   at System.Reflection.RuntimeAssembly.nLoad(AssemblyName fileName, String codeBase, RuntimeAssembly assemblyContext, StackCrawlMark& stackMark, Boolean throwOnFileNotFound, AssemblyLoadContext assemblyLoadContext)
   at System.Reflection.RuntimeAssembly.InternalLoadAssemblyName(AssemblyName assemblyRef, StackCrawlMark& stackMark, AssemblyLoadContext assemblyLoadContext)
   at System.Reflection.Assembly.Load(String assemblyString)
   at IKVM.Internal.JVM.CriticalFailure(String message, Exception x) in D:\a\ikvm\ikvm\src\IKVM.Runtime\vm.cs:line 272

Following that error I attempted to convert the Jar to a Library:
C:\Users\dlangston\Downloads\IKVM-8.2.0-prerelease.911-tools-netcoreapp3.1-win7-x64\ikvmc.exe -assembly:tika-app -classloader:ikvm.runtime.AppDomainAssemblyClassLoader -target:library C:\Users\dlangston\Downloads\Tika\tika-app-2.4.1.jar -nostdlib -reference:C:\Users\dlangston\Downloads\IKVM-8.2.0-prerelease.911-tools-netcoreapp3.1-win7-x64\refs\*.dll

I imported that library into a new C# project and attempted to create a new instance of Tika:

using org.apache.tika;

var tika = new Tika();

Initially this reproduced the same error from above referencing System.Windows.Forms. After adding WinForms to my project and running again, I'm now seeing what I believe is the actual exception.

Unfortunately that was the end of my rope... Anything that can be done to continue troubleshooting? Any insight at this point is greatly appreciated!!

---------------------------
IKVM.NET 8.2.0.0 Critical Failure
---------------------------
****** Critical Failure: Exception during finishing of: java.lang.invoke.BoundMethodHandle$Species_L3 ******

PLEASE FILE A BUG REPORT FOR IKVM.NET WHEN YOU SEE THIS MESSAGE

(on Windows you can use Ctrl+C to copy the contents of this message to the clipboard)

IKVM.Runtime, Version=8.2.0.0, Culture=neutral, PublicKeyToken=13235d27fcbfff58
C:\Program Files\dotnet\shared\Microsoft.NETCore.App\6.0.5\
6.0.5 64-bit

System.TypeLoadException: Method 'speciesData' on type 'java.lang.invoke.BoundMethodHandle$Species_L3' from assembly 'ikvm_dynamic_assembly__950467578, Version=2022.627.1112.42966, Culture=neutral, PublicKeyToken=null' is overriding a method that is not visible from that assembly.
   at System.Reflection.Emit.TypeBuilder.CreateTypeNoLock()
   at System.Reflection.Emit.TypeBuilder.CreateType()
   at IKVM.Internal.DynamicTypeWrapper.FinishContext.FinishImpl()
   at IKVM.Internal.DynamicTypeWrapper.JavaTypeImpl.FinishCore()
   at System.Reflection.Emit.TypeBuilder.CreateTypeNoLock()
   at System.Reflection.Emit.TypeBuilder.CreateType()
   at IKVM.Internal.DynamicTypeWrapper.FinishContext.FinishImpl()
   at IKVM.Internal.DynamicTypeWrapper.JavaTypeImpl.FinishCore()

   at IKVM.Internal.JVM.CriticalFailure(String message, Exception x)
   at IKVM.Internal.DynamicTypeWrapper.JavaTypeImpl.FinishCore()
   at IKVM.Internal.DynamicTypeWrapper.JavaTypeImpl.Finish()
   at IKVM.Internal.DynamicTypeWrapper.Finish()
   at IKVM.Java.Externs.sun.misc.Unsafe.ensureClassInitialized(Object thisUnsafe, Class clazz)
   at sun.misc.Unsafe.ensureClassInitialized(Class clazz)
   at java.lang.invoke.BoundMethodHandle.Factory.generateConcreteBMHClass(String types)
   at java.lang.invoke.BoundMethodHandle.SpeciesData.get(String types)
   at java.lang.invoke.BoundMethodHandle.SpeciesData.access$200(String x0)
   at java.lang.invoke.BoundMethodHandle.getSpeciesData(String types)
   at java.lang.invoke.BoundMethodHandle.checkCache(Int32 size, String types)
   at java.lang.invoke.BoundMethodHandle.speciesData_LLL()
   at java.lang.invoke.MethodHandleImpl.makeGuardWithTestForm(MethodType basicType)
   at java.lang.invoke.MethodHandleImpl.makeGuardWithTest(MethodHandle test, MethodHandle target, MethodHandle fallback)
   at java.lang.invoke.MethodHandles.guardWithTest(MethodHandle test, MethodHandle target, MethodHandle fallback)
   at com.jmatio.io.MatFileReader.unmapHackImpl()
   at com.jmatio.io.MatFileReader.__<>Anon3.run()
   at java.security.AccessController.doPrivileged(Object action, AccessControlContext context, CallerID callerID)
   at java.security.AccessController.doPrivileged(PrivilegedAction action, CallerID )
   at com.jmatio.io.MatFileReader..cctor()
   at com.jmatio.io.MatFileReader.setAllowObjectDeserialization(Boolean allowDeserialization)
   at org.apache.tika.parser.mat.MatParser..cctor()
   at org.apache.tika.parser.mat.MatParser..ctor()
   at System.RuntimeType.CreateInstanceDefaultCtor(Boolean publicOnly, Boolean wrapExceptions)
   at System.Activator.CreateInstance(Type type, Boolean nonPublic, Boolean wrapExceptions)
   at System.Activator.CreateInstance(Type type)
   at IKVM.Java.Externs.sun.reflect.ReflectionFactory.ActivatorConstructorAccessor.newInstance(Object[] objarr)
   at java.lang.reflect.Constructor.newInstance(Object[] initargs, CallerID )
   at java.lang.Class.newInstance(CallerID )
   at org.apache.tika.utils.ServiceLoaderUtils.newInstance(Class klass, ServiceLoader loader)
   at org.apache.tika.config.ServiceLoader.loadStaticServiceProviders(Class iface, Collection excludes)
   at org.apache.tika.parser.DefaultParser.getDefaultParsers(ServiceLoader , EncodingDetector , Renderer , Collection )
   at org.apache.tika.parser.DefaultParser..ctor(MediaTypeRegistry registry, ServiceLoader loader, Collection excludeParsers, EncodingDetector encodingDetector, Renderer renderer)
   at org.apache.tika.parser.DefaultParser..ctor(MediaTypeRegistry registry, ServiceLoader loader, EncodingDetector encodingDetector, Renderer renderer)
   at org.apache.tika.config.TikaConfig.getDefaultParser(MimeTypes , ServiceLoader , EncodingDetector , Renderer )
   at org.apache.tika.config.TikaConfig..ctor()
   at org.apache.tika.config.TikaConfig.getDefaultConfig()
   at org.apache.tika.Tika..ctor()
   at SYSTM.ContentExtraction.TikaParser.Parse(Stream input, Stream output)
   at SYSTM.ContentExtraction.Tests.Tests.Test1() in C:\Users\dlangston\source\repos\SYSTM.ContentExtraction\SYSTM.ContentExtraction.Tests\UnitTest1.cs:line 21
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Span`1& arguments, Signature sig, Boolean constructor, Boolean wrapExceptions)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at System.Reflection.MethodBase.Invoke(Object obj, Object[] parameters)
   at NUnit.Framework.Internal.Reflect.InvokeMethod(MethodInfo method, Object fixture, Object[] args)
   at NUnit.Framework.Internal.MethodWrapper.Invoke(Object fixture, Object[] args)
   at NUnit.Framework.Internal.Commands.TestMethodCommand.InvokeTestMethod(TestExecutionContext context)
   at NUnit.Framework.Internal.Commands.TestMethodCommand.RunTestMethod(TestExecutionContext context)
   at NUnit.Framework.Internal.Commands.TestMethodCommand.Execute(TestExecutionContext context)
   at NUnit.Framework.Internal.Commands.BeforeAndAfterTestCommand.<>c__DisplayClass1_0.<Execute>b__0()
   at NUnit.Framework.Internal.Commands.DelegatingTestCommand.RunTestMethodInThreadAbortSafeZone(TestExecutionContext context, Action action)
   at NUnit.Framework.Internal.Commands.BeforeAndAfterTestCommand.Execute(TestExecutionContext context)
   at NUnit.Framework.Internal.Execution.SimpleWorkItem.<>c__DisplayClass4_0.<PerformWork>b__0()
   at NUnit.Framework.Internal.ContextUtils.<>c__DisplayClass1_0`1.<DoIsolated>b__0(Object _)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at NUnit.Framework.Internal.ContextUtils.DoIsolated(ContextCallback callback, Object state)
   at NUnit.Framework.Internal.ContextUtils.DoIsolated[T](Func`1 func)
   at NUnit.Framework.Internal.Execution.SimpleWorkItem.PerformWork()
   at NUnit.Framework.Internal.Execution.WorkItem.RunOnCurrentThread()
   at NUnit.Framework.Internal.Execution.WorkItem.Execute()
   at NUnit.Framework.Internal.Execution.ParallelWorkItemDispatcher.Dispatch(WorkItem work, ParallelExecutionStrategy strategy)
   at NUnit.Framework.Internal.Execution.ParallelWorkItemDispatcher.Dispatch(WorkItem work)
   at NUnit.Framework.Internal.Execution.CompositeWorkItem.RunChildren()
   at NUnit.Framework.Internal.Execution.CompositeWorkItem.PerformWork()
   at NUnit.Framework.Internal.Execution.WorkItem.RunOnCurrentThread()
   at NUnit.Framework.Internal.Execution.WorkItem.Execute()
   at NUnit.Framework.Internal.Execution.TestWorker.TestWorkerThreadProc()
   at System.Threading.Thread.StartCallback()


@dylanlangston dylanlangston changed the title Critial Failure Trying to create new instance of Apache Tika Critial Failure - Trying to create new instance of Apache Tika Jun 27, 2022
@dylanlangston dylanlangston changed the title Critial Failure - Trying to create new instance of Apache Tika Critial Failure - Trying to compile Apache Tika using .NET Core 3.1 Tools Jun 27, 2022
@wasabii
Contributor

wasabii commented Jun 28, 2022

Eh, that code that is loading WinForms is lame. That certainly needs to be addressed. We shouldn't be doing dialogs like that for critical failures. But that's going to take some reassessment and time I think.

The actual error, the one about overriding a method that is not visible from the assembly, is probably going to take some digging into. You say it works fine on Framework? What does this library do?

I see this in your stack:

at sun.misc.Unsafe.ensureClassInitialized(Class clazz)
at java.lang.invoke.BoundMethodHandle.Factory.generateConcreteBMHClass(String types)

This makes me think it's trying to do some runtime bytecode compilation. Do you know if this library does that?
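
For narrowing this down, the frames above pass through `MethodHandles.guardWithTest`, which in the trace leads to `BoundMethodHandle.Factory.generateConcreteBMHClass`. A minimal sketch of a stand-alone check follows, assuming (this is not confirmed anywhere in the thread) that any `guardWithTest` call reaches the same class-generation path under IKVM. The class and method names are hypothetical and do not come from Tika or JMatIO.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical stand-alone check; none of these names come from Tika or JMatIO.
final class GuardWithTestRepro {
    static boolean isEmpty(String s) { return s.isEmpty(); }
    static String whenEmpty(String s) { return "<empty>"; }
    static String whenNotEmpty(String s) { return s.toUpperCase(); }

    // Builds a guarded handle: if isEmpty(input) then whenEmpty(input) else whenNotEmpty(input).
    static String run(String input) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType unary = MethodType.methodType(String.class, String.class);
            MethodHandle test = lookup.findStatic(GuardWithTestRepro.class, "isEmpty",
                    MethodType.methodType(boolean.class, String.class));
            MethodHandle target = lookup.findStatic(GuardWithTestRepro.class, "whenEmpty", unary);
            MethodHandle fallback = lookup.findStatic(GuardWithTestRepro.class, "whenNotEmpty", unary);
            // Same public entry point that appears in the stack trace.
            MethodHandle guarded = MethodHandles.guardWithTest(test, target, fallback);
            return (String) guarded.invokeExact(input);
        } catch (Throwable t) {
            // invokeExact declares Throwable; wrap it so callers need no checked handling.
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(""));     // takes the target branch
        System.out.println(run("tika")); // takes the fallback branch
    }
}
```

On a stock JVM this simply prints the two branch results. If the assumption holds, running the compiled class under the .NET Core tools would be expected to hit the same critical failure without involving Tika at all.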

@dylanlangston
Author

dylanlangston commented Jun 28, 2022

@wasabii Thank you for the reply! Correct, everything works on Framework; the same JAR on .NET 6.0 produces the stack trace from my original message.

Apache Tika is used to determine a file type, metadata, and text contents. I'm not much of a Java developer and am unsure if it uses runtime bytecode compilation or not. My end goal was to utilize IKVM to compile Tika into a .NET library and implement its parser interface in .NET.

I've got a test harness up on GitHub here (dylanlangston/IKVM-TikadotNet-Harness) that demonstrates the issue concisely and consistently. It contains two nearly identical projects that use the new IkvmReference tag (very cool, btw) to compile Apache Tika, which can be downloaded separately here. Tikadot462 works, whereas Tikadot6 fails.

@wasabii
Contributor

wasabii commented Jun 28, 2022

I believe I've got this one. Tika is trying to generate some dynamic bytecode, and so IKVM is trying to generate a corresponding dynamic assembly at runtime, with access to the internal members of the Tika assembly. To make that possible, the static compiler places an InternalsVisibleTo attribute on the generated assembly, referring to the future possibility of a strong-named assembly coming along that matches, and thus has access to internal methods.

However, Core is no longer able to use StrongNameKeyPair to sign assemblies; it throws a PlatformNotSupportedException. And thus the generation of the assembly fails.

And then all this gets hidden by the WinForms error code stuff.

So, strong names are not really a useful thing in Core. I've therefore changed the static compiler on Core to not emit an InternalsVisibleTo attribute with the public key, nor to generate a dynamic assembly with the key. InternalsVisibleTo is still required; it just doesn't have to match on public key anymore.

I've left the code that does all this on Framework. Though, I'm pretty sure this is no longer required on Framework either. Need to develop a test case to confirm that. But the forged key pair stuff should be able to go away on .NET 4 as well.

@wasabii
Contributor

wasabii commented Jun 28, 2022

So, no. There's no security benefit here on .NET 4. All of the validation of this stuff at runtime is gone.

However, there is still a requirement that InternalsVisibleTo attributes contain strong names when installing an assembly into the GAC. So, as long as we want GAC support for our generated assemblies, this still needs to be here.

I'd +1 for removing GAC support for our generated assemblies. I don't think we should waste time or effort catering to it in this day and age. Maybe unpopular opinion? I don't know. Either way, problem for another day.
