
Import statement please read #85

Open
marcobambini opened this issue Mar 15, 2017 · 26 comments

@marcobambini
Owner

The current import statement is a static operation performed at compile time; there is no runtime import statement in the current version.

When you write:
import "file1.gravity"
what happens under the hood is that the content of file1.gravity is textually inserted at that position, then parsed and compiled (the implementation is more efficient than that, but you get the point).
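
For illustration, the compile-time behavior is roughly the following (file names here are just made up):

// math.gravity
func square(x) { return x * x }

// main.gravity
import "math.gravity"    // behaves as if square() were declared right here
func main() { return square(4) }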

It turns out that importing should also be a runtime operation, in order to be able to import already compiled bytecode or shared libraries (and probably even source code).

So my first question is: how do you think this ambiguity should be solved?
Should the static import and the runtime import use two different keywords?
For example, include for static and import for runtime?
More questions will follow.

@ghost

ghost commented Mar 15, 2017

Yeah, at first I was a little confused about the usage of the word import, since most of the time import is used for importing a module or something alike. I think it would be correct to use include for static and import for runtime.

@brandon-ray
Contributor

I think having two different keywords will work fine.

However, I would like to see whether it would be possible to detect that a "static import" can be done, so we could keep only the import keyword. For instance, if an import is done at the top of a file, outside of all scope blocks and only in the main scope, or the import argument is a literal, then it could do the "static import" instead of the "runtime import" and avoid confusion. We could make a note in the documentation about this.

@marcobambini
Owner Author

I like your suggestion @brandon-ray so:
import "file1.gravity" would be static
while
import file1 would be dynamic
then at runtime Gravity will look for file1.gravity, file1.g and file1.so (if on Linux)

@ghost

ghost commented Mar 15, 2017

That would be nice, although it would need to be heavily documented, because people would get confused.

@brandon-ray
Contributor

Yes, something like that @marcobambini. Though you should still be able to import dynamically at runtime even with a string literal. I suppose there would need to be a check for this; I could see the logic going something like this:

if (import is a literal string) {
    if (import is a .gravity file and exists at compile time) {
        // Perform static import...
    } else {
        // Perform dynamic import later on at runtime...
    }
} else {
    // Perform dynamic import later on at runtime...
}

This is assuming that we can only do static imports for .gravity files though, which seems to be the case.

@ghost

ghost commented Mar 15, 2017

Why would a source file not be available at compile time?

@brandon-ray
Contributor

It would cover the case where someone is using a conditional import, like in an if block or a try/catch block. If we threw an error at compile time, then these would not be able to work.

@ecanuto

ecanuto commented Mar 15, 2017

If you guys don't mind, I would like to suggest another way. What about having an import statement for compile-time import and an import function for runtime import? Like this:

import "file.gavity"; // static import
import file.gravity; // static import
import("file.gravity"); // runtime import

I don't know why, but it makes more sense to me.

@sargun
Contributor

sargun commented Mar 16, 2017

why not:

link "file.gravity"; // static import
import "fily.gravity"; // runtime import

@marcobambini
Owner Author

@sargun we are trying not to introduce another reserved keyword.

@kazzkiq
Contributor

kazzkiq commented Mar 16, 2017

In Brunch JavaScript projects, there are npm packages and source files. You can use the same keyword to import both; what differs is how you refer to them. Example:

import { Ajax } from 'ajax-lib'    // this lib comes from NPM
import { MySuperClass } from './helpers/MySuperClass'    // the "./" assumes it isn't a npm package

Perhaps something similar could be used in Gravity:

import "binary" // assumes an already compiled source
import "gravity-compiled-file.g" // same as above
import "gravity-source-file.gravity" // assumes non-compiled file

This way you would let Gravity know how to import those files based on their extensions (no extension and .g are interpreted as bytecode, while .gravity is interpreted as non-compiled source code).

@ecanuto

ecanuto commented Mar 16, 2017

@kazzkiq the problem is not where the lib or file is located. The problem is when to import the file: at compile time or at runtime.

@kazzkiq
Contributor

kazzkiq commented Mar 16, 2017

@ecanuto Exactly. But how do you define what should be imported at runtime or compile time? My proposal is to let Gravity choose how to handle an import (add it at compile time or runtime) based on its file extension.

@ecanuto

ecanuto commented Mar 17, 2017

@kazzkiq Maybe I am wrong, but you are suggesting a way to import two kinds of files: one already compiled (bytecode) and the other not compiled (source code). The problem here is not the compiled versus non-compiled file; the problem is whether we want to perform the import at compile time or at runtime.

Think about this example:

import "binary1" // compiled, I want to import at compile time
import "gravity-compiled-file1.g" // compiled, I want to import at compile time
import "gravity-source-file1.gravity" // non-compiled file, I want to import at compile time

import "binary2" // compiled, I want to import at runtime
import "gravity-compiled-file2.g" // compiled, I want to import at runtime
import "gravity-source-file2.gravity" // non-compiled file, I want to import at runtime

How will we decide when to import binary1 and binary2? Same syntax, but I want them imported at different times. The file binary1 must exist when the source code is compiled, but binary2 must exist only while the program is executing.

I think that an import instruction plus an import function would be more reasonable: the instruction could be allowed only at the beginning of the program, while the function could be used anywhere.

@JoshTheDerf
Contributor

JoshTheDerf commented Mar 17, 2017

Well, if the preference is to avoid an additional keyword, could we just re-use the static keyword for static imports?

// Static Imports
import static "./binary2" // Inline at compile-time from target file.
import static "./something.g" // Import compiled target file.
import static "./something.gravity" // Import and compile target source file.

// Dynamic / Runtime Imports
import "./binary2" // Inline at runtime from target file.
import "./something.g" // Import compiled target file.
import "./something.gravity" // Import and run target source file.

Otherwise, I'd vote for the include keyword for static imports (because you're literally including it in the file) and keeping the import keyword for dynamic imports.

In my opinion, using file paths or extensions to determine how modules are imported is a somewhat dangerous methodology, as you're removing choices and use cases that might actually be beneficial for some people.

@marcobambini
Owner Author

Based on this discussion I'd vote for include for static and import for runtime.

@marcobambini
Owner Author

In order not to introduce a new keyword, I think I am going to add a new macro (the other one is #unittest): #include "file.gravity", which will statically include "file.gravity" into the currently parsed file.

The runtime import keyword will be syntactic sugar for System.import, so we can implement it without modifying the virtual machine. We need to figure out which options and which syntax the runtime import keyword will support.
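
A rough sketch of how the two forms might sit next to each other (module names are made up, and the exact lowering is still to be decided):

#include "utils.gravity"                // statically inlined into this file while parsing

import "network"                        // proposed runtime form...
var network = System.import("network")  // ...roughly the call it would desugar to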

@Jezza

Jezza commented Mar 20, 2017

Might be worth checking out Lua's require system.

@Jezza

Jezza commented Mar 20, 2017

Although that might imply a semantic change within the language, as Lua can execute the file within the same environment, versus having it copied in verbatim.

@YurySolovyov

Maybe do it via a System method?
Like System.import(path)
Or introduce a Module namespace if that feels more elegant...
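
Something like this, for both of the shapes above (all of it hypothetical):

var a = System.import("json")   // reuse the existing System class
var b = Module.load("json")     // or go through a dedicated Module namespace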

@IngwiePhoenix
Contributor

IngwiePhoenix commented Nov 17, 2018

Hey guys. :)

I had previously opened an issue and was referred here, so I thought I would share my personal opinion as well.

First: I think the import syntax that ES6 uses is awkward. Why? Well, it evolved during its introduction. At first, there seemed to only be "default imports":

import Audio from "audio";

But then, we had a new syntax to specify explicit pieces we wanted:

import {Playback} from "audio";

So what I would NOT like Gravity to do is introduce default exports at all. Instead:

import Foo, Bar from "module";

This would make people more aware of what they export. Thus, although it may sound like a little more work, picking exactly which things you are going to import from a module decreases the likelihood of polluting a namespace. Plus, you are actually optimizing, as you only use what you really need, instead of importing pretty much everything - only to then figure out what you actually imported altogether...

Second: Using #include feels very much like C - and makes most people (at least me) think about "including file A into the current file". This macro has been around for a very long time, and the C preprocessor has always done one and the same thing: use that macro to copy one source file into another, so that forward declarations are available to the rest of the code and it can be properly read and processed - and thus, ultimately, compiled. It would read nicely to see static import... - but using #include makes far more sense, as I just said.

Third: NPM isn't exactly decentralized - whereas for Gravity, I would totally prefer that. This way, people can easily host their own packages anywhere. miniz may be used - or a locally available libz - to support Zip files. They are easy to make and distribute. They can also be uploaded basically anywhere. Now, what a package manager should do is utilize the URL - to be precise, the host section: https://github.com/... --> github.com:. Now, what comes after the colon could essentially be either the URL path, if desirable, or a user/package path. This one might even be easy to pick up from a downloaded file's package description. Considering one writes { "name": "me/testpackage" } into a JSON-based description, the package manager may load this package as github.com:me/testpackage. However - thinking about the directory tree, it may make more sense to actually create subfolders instead - like github.com/me/testpackage - and not use a colon. But it works well for demonstration purposes, I guess...
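
For instance, the mapping described above might end up looking like this (everything here is hypothetical):

// package description inside the downloaded archive:
//   { "name": "me/testpackage" }
// fetched from:           https://github.com/me/testpackage
// cached in a module dir: .../github.com/me/testpackage
import "github.com/me/testpackage"   // resolved against that cache directory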

Fourth: Paths. I really like the idea of setting a library path and not hardcoding it - however, it might be easier to actually soft-code a path. Let's say the user is trying out Gravity and just wants to fool around with it (which is what I usually do when I run across something cool...); he may not want to dig up environment variables or command-line switches, and "just" wants to get going. Now, it could be possible to soft-code something like gravity_modules and let the user append to it via CLI flags, the C API or any other usable method. In fact, there might also be a switch to just make Gravity's soft-coded configuration blank, so the user could specify absolutely everything by himself. This way, the gravity_modules folder would not always be present, but could be moved out of the search path easily at any time.

Now, this is kind of related here: file extensions. I like .g a lot, and I haven't seen .n before. So I would definitely go with these two. Now, as for native modules, you would have these options:

  • Check for lib*.so or *.so on Linux and MacOS,
  • Check for lib*.dylib, lib*.so, lib*.plugin or *.dylib, *.so or *.plugin on MacOS (and there are way more extensions like that, mind you...)
  • Or check for *.dll on Windows. Now, sometimes, some folks even use .so on Windows, which makes things even harder.

My suggestion: Don't. Do. This.
Instead, make the user rename their native module to something like .gso or similar. I was actually thinking about .gn, but that one is taken ("Generate Ninja", Google). This is also what NodeJS does - you have to name them *.node in order to have them found. Although there are modules that work around that, most likely, this is their "convention".

Fifth: FFI. Now, it really depends on the API whether implementing native bindings through C is preferred over FFI or vice versa. But what if we just did not include this in the "core" - but instead made it an example module? What I am talking about is: https://github.com/libffi/libffi

By stating that the preferred way is to use the C API, but allowing someone to create bindings through a port of libffi as well, we would be able to cherry-pick the desired method. Now, libffi only works for C - as far as I know - but it would still be very useful to have an alternative approach sometimes, if desired.

To sum up: Prefer Gravity C-API based native modules over FFI, but make FFI available as an alternative AND as an example of a native module (well, complex example, I guess - but an example still!)

Sixth: Defining modules natively. There are two ways in which a module may be loaded: Through the script, or through the C API. Now, what NodeJS does is this:

First, there are actually two types of modules: internal ones, and externally loaded ones. Internal ones are basically kept in a struct (I didn't find that exact one, must've overlooked it). This struct is iterated at the beginning to "mass register" the modules with the whole system, allowing Node a much faster lookup - because those modules are already loaded, and do not need to be searched on disk first only to fall back to looking internally. So, being able to define modules that are "internal" - ones that are not actually available on disk, but rather inside the very library or binary - would probably be quite crucial.

As for external modules, they simply load the binary and craft a pointer to the given registration and unregistration functions - basically, a constructor and destructor for the lifetime of the binary. The constructor is usually passed some default values - which in this case might be the VM, or something like that - you get the idea. So should Gravity pick up a native module from the disk, it also needs to maintain its lifetime throughout the process - and let it destruct at process end.

But that isn't all yet - nope! A similar mechanic needs to also be applied to internal modules. The reason for that is quite simple: let me take my concept of a build tool based on a complete scripting language as an example here. In order to implement HTTP/S downloading into the system, I need to add a proper library-binding module right into the binary. Now, some of those actually have initialization and de-initialization functions that need to be called - even when linked into the binary. So we would need to be able to handle that scenario for internal modules too.

However, this makes something really simple:

  1. Actually, we can reuse existing functions for both - internal, and external.
  2. The internal modules just need to be bound by the C API user. For instance, by calling a couple of registration functions from different internal modules: init_gravitymodule_liba(vm); init_gravitymodule_libb(vm); init_gravitymodule_libN(vm)....
  3. Due to the module cache being made aware of native modules anyway, there is a chance that we "accidentally" allow text-based modules to be pre-registered as well. By using a tool like incbin, a user may also "embed" scripts right away! So since the cache can be filled with pre-registered modules due to internal native modules, this would actually allow internal scripted modules (or internal text modules, if you prefer that name) to be pre-registered too, and make them available immediately. That... is actually pretty cool.

I mean, not everything is written in C when using an embedded scripting language. So being able to embed some pre-written scripts as an internal module may come in quite handy. In my case, I am zipping up a lib/ folder, embedding it as a const char[] and const size_t, and using miniz to access that file and read it. That way, I can run scripts off the binary and pre-register the library of scripts - well, could. Currently, I am working on automating this a little... But that is at least my idea of doing it for now.


Those are all pretty much my own opinions. I did read a lot of cool ideas above, but I am not very good at backtracking through a discussion and picking up user names in order to mention them - kind of a flaw related to my visual impairment, which implies a miniature field of vision ^^; So, excuse me for that! I really hope to see this becoming a thing in the future!

Also, sorry for the wall of text... I just realized how long this thing got... o.o;


EDIT (1.40 AM, German time)
I just realized that I had forgotten to mention my thoughts on whether to use System.import or a Module class.

I believe it should be a separate class. The reason is that you would be able to mirror the module cache, search paths and everything else in one object/class rather than squeezing it into System - whose name suggests something other than module operations. Also, something that I learned from Wren (http://wren.io) is that it uses a separate Fiber for each module execution in order to isolate it from other modules. These fibers are structured in the order they appear, as import statements are being made. Later, when the module is within the cache, it does not get re-run; the exports are simply returned (which, in Wren, means all "globally" defined variables/classes, which I find rather impractical). Also, as for the signature, I would suggest:

  • Have one "meta" method: Module.load(moduleName, importsWanted),
  • a method that searches for modules: Module.resolve(moduleName),
  • a method that evaluates each type of module:
    • Module.runScriptModule(resolvedModulePath) -> Object{importName : Value}
    • Module.runNativeModule(resolvedModulePath) -> Native Module handle
      • This one may be implemented through the C-API instead of a script, actually.
    • Module.runBytecodeModule(resolvedModulePath) -> Object{importName : Value}

By splitting the module loading into functions, embedders/users can decide which part they'd like to change. Here is a proposal for such a class:

class Module {

  // Encapsulate the cache - make it freely accessible internally,
  // but read-only to the public.
  // There is probably a more elegant way for this, though...
  private static var _cache;
  public static var cache {
    get { return _cache; }
  }

  /**
   * Store the search path for modules. Has to be used by .resolve().
   * @type {Array}
   */
  private static var _paths = [];
  public static var paths {
    get { return _paths; }
    set {
      if(value is Array) {
        _paths = value;
      } else {
        // Actually, what is the proper way of throwing/causing an error?
        triggerAnError("Cannot set Module.paths to anything other than an Array.");
      }
    }
  }

  private static var _exts = NULL;
  public static var extensions {
    get {
      if(this._exts == NULL) {
        this._exts = new this.ModuleExtensions();
      }
      return this._exts;
    }
  }
  private static class ModuleExtensions {
    // Store a map of extensions and possibly additional processors.
    // By default, this might be:
    // ".g": Module.runScriptModule(...)
    // ".n": Module.runBytecodeModule(...)
    // ".gso": Module.runBinaryModule(...)
    var extensions = {};
    func addExt(ext, processor) {
      if(ext in this.extensions) {
        causeAnError(
          "Given extension is already set. Remove it first to"
          + " add a new processor for it. Overriding a processor is unsafe."
        );
      }
      if(processor is Function) {
        this.extensions[ext] = processor;
      } else {
        causeAnError(
          "Given processor is of type \(typeof processor) - but Function was"
          + " expected."
        );
      }
    }
    func removeExt(ext) {
      if(ext in this.extensions) {
        delete this.extensions[ext];
        return true;
      } else {
        return false;
      }
    }
    func hasExt(ext) {
      return (ext in this.extensions);
    }
    func getProcessor(ext) {
      if(this.hasExt(ext))
        return this.extensions[ext]
      else
        return false;
    }

    // Operator mapping. I'm just gonna list them instead of "implementing" them:
    // o[key]: Get processor
    // o[key] = value: Set processor
    // Call .removeExt() to remove. There might be a better way, though.
    // Maybe when o[key] = value is used and value==NULL, we'll attempt to
    // unset the processor?
  }

  /**
   * Returns the exports required by `importsWanted` for a given module.
   * @param moduleName:    String containing the module name given in the
   *                       import statement.
   * @param importsWanted: Array of variables, objects, classes or alike, that
   *                       the requesting module would like to import from the
   *                       given `moduleName`.
   *
   * @return Should return the imports. Now, since the `import` statement by
   *         itself is actually syntactic sugar for this, there probably has
   *         to be some logic to disassemble the returned map and play out the
   *         new variables in the requesting scope.
   */
  public static func load(moduleName, importsWanted) {
    // ...
    return imports;
  }

  /**
   * Resolve `moduleName` to an actual file. In some scenarios, this one might
   * actually be implemented in C rather than the script - but by default, is
   * probably best implemented in the script - as C-Calls can stack up really
   * fast - and something like resolving happens to look cleaner in a script
   * than in C - at least, most of the time.
   *
   * @param moduleName: The name of the module that is to be resolved.
   *
   * @return A string to the actual file. We do NOT return the source here - 
   *         because the exec*() methods may use custom processing. I.e., the
   *         script-oriented executor will use the raw text, whilst the
   *         Bytecode executor will operate slightly different. Therefore,
   *         this method shall only return a string, which should be used
   *         within the cache.
   */
  public static func resolve(moduleName) {
    // ...
    return modulePath;
  }

  /**
   * Those are supposed to be the module executors. Each of these is actually
   * likely implemented in C rather than the script, as VM access is required.
   * I still put them here for completeness.
   *
   * To all three of them applies:
   * @param modulePath: A fully resolved path to the module that is to be
   *                    imported. It should be returned by .resolve() and be
   *                    requested by .load().
   *
   * Basically:
   * 1. .load() receives the module name,
   * 2. resolves it using .resolve()
   * 3. executes it (either depending on file suffix or other method)
   * 4. and finally, .load() returns the requested variables as a map.
   */
  public static func runScriptModule(modulePath) { /* ... */ }
  public static func runBinaryModule(modulePath) { /* ... */ }
  public static func runBytecodeModule(modulePath) { /* ... */}
}

This would also allow a JSON-processor, so we could import specific keys from JSON.

import name, version from "./package.json";

And, we would get to redefine functions too - if the embedder or user really wants to mess this much with the module system. Appending to the module path would also be possible at any time.

@IngwiePhoenix
Contributor

I just saw @Jezza's comment about Lua's require system - and although I think of NodeJS' require() first, this is not that bad an idea either.

A module system can be implemented without changing the syntax of the language at all. Imagine the following call:

var audio = require("audio");

What require() would do is call Module.load(...) (from my example above). The change would be that imports would not be specified; instead, the whole file would be loaded and executed in a separate fiber, with an exports object being extracted from that fiber, stored in a gravity_hash_t where the keys represent the resolved modules, and returned to the caller.

That would be very much a non-syntax-changing API; it only requires the possibility of extracting values from one Fiber and transferring them over to another.

func getExports(fullFilePath: string) {
  var f = fiber.createFromFile(fullFilePath);
  var fibReturn = f.runAndWait();
  return fibReturn.exports;
}

// Within a file given to that function:
exports["foo"] = "bar";

// within a caller:
var myModule = require("./other.g");
System.print(myModule["foo"]); // "bar"

However, this would quite obviously not be a very user-friendly syntax, since dot-notation for this would be nicer, which a map does not offer. :) But still, this is very possible.

@Xandaros

I personally really like how Rust handles modules. It's a static language, so I'm not sure how well this will be received, but:

You basically build a module tree. The entire tree will be available in the global namespace. For example, if you have a module a, a will be in the global namespace. If a has a submodule b, you can reach it through a: a::b.

While this doesn't allow dynamic loading of modules, which might still be wanted, it does allow modules to always be accessible. Circular imports are also not a problem; every module can use every other module. (Well, Rust also has visibility; let's just assume everything is public.)

Especially for bigger and closely coupled projects, avoiding circular imports becomes a bit of a chore. Not having it be an issue at all is very nice.

@IngwiePhoenix
Contributor

Hey hey :)

Been a while since I last checked in with Gravity - and after checking the docs, I noticed that not a whole lot seems to have changed. So, I decided to check the issues and saw this open.

What is the current state of the import statement - and module loading in general?

Thanks in advance! :)

@SarahIsWeird

SarahIsWeird commented May 2, 2021

Hey! It's been a bit since this had some activity in it, so I'm currently working on this as a proof-of-concept.

Personally, I like what @marcobambini proposed, using import as syntactic sugar for System.import. I think it's beneficial for a library to decide on what it's going to export, so we could have an export() function which could return what it wants. This would be up to the library author. A small example:

In testmodule.gravity

class Test {
    static func sayHello() {
        System.print("Hello from the module!");
    }
}

func export() {
    return Test;
}

In main.gravity

var Test = System.import('testmodule');

func main() {
    System.print("Hello from the main file!");
    Test.sayHello();
}

I've actually implemented this already. You can check it out on my fork of this repo. Natives work too, but only on Linux. The problem I'm having right now is not polluting the global namespace with unwanted junk from the library itself, but it's just a concept right now anyway.

I think import should search for files in the PATH (or maybe even GRAVITY_PATH?) and look for somelib.gravity for source files, somelib.g for pre-compiled files, and somelib.gso for native files (I think @IngwiePhoenix's naming is spot-on) and be able to load from all the directories, preferring files in the current directory.
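
A minimal sketch of that lookup order, assuming a hypothetical resolve() helper and a File.exists() check (neither exists in Gravity today; they are only here to illustrate the idea):

func resolve(name, searchPaths) {
    // prefer the current directory, then each configured search path
    var dirs = ["."];
    for (var p in searchPaths) { dirs.push(p); }

    // source first, then pre-compiled bytecode, then native modules
    var exts = [".gravity", ".g", ".gso"];

    for (var dir in dirs) {
        for (var ext in exts) {
            var candidate = dir + "/" + name + ext;
            if (File.exists(candidate)) return candidate;   // File.exists() is assumed, not a real API
        }
    }
    return null;
}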

Speaking of native libraries: In my opinion, libffi isn't needed (but could be implemented with a native library as a bridge). import could just call a predefined function from the library (I like gravity_export), which would serve a similar role to the register function for the optional classes, only that it'd return a gravity_value_t*. This would be to make it behave the same way as an import of a gravity source file. @marcobambini: Does a newly created class necessarily need to be initialized with the vm pointer? If not, we wouldn't even need the gravity_vm *vm parameter.

If people (read: Marco) aren't interested in importing anymore, I'd make it my own project.

Edit: In my test you can't use import for System.import. I don't know enough about compilers to make that work.
Also, if we get this done, we can actually work on the package manager!

@IngwiePhoenix
Contributor

Hello there @SarahIsWeird !

I have been checking Gravity's commit log on and off, hoping to spot the introduction of the import statement. But... to no avail. I am still very interested in this, especially after getting quite deep into the V language. Currently, I am working on making SWIG do some important stuff for this (downgrade a C++ API to a plain C API and expose that to V - as well as building on Windows via CMake).

I would love to use Gravity in my projects! So, yes, I am still very interested in the statement and the features altogether :) For a project I am working on, I was considering using JerryScript, but it is quite big and might actually be more overhead than what the project needs in terms of scripting. Gravity would work well there.

I am going to check out your fork and see how it performs and how you implemented it in detail! Hopefully, "native" import will land eventually :)

Also thanks for the ping; I might've otherwise missed this post. ^^'

Kind regards,
Ingwie
