
Improvement of code metadata #11

Open
SicroAtGit opened this issue Oct 2, 2019 · 8 comments

Comments

@SicroAtGit
Owner

I'm thinking about a new format for the code metadata.


The current format is too complicated:

  • Single-file code:

    CodeName[Win].pb, CodeName[Win,Lin].pb, CodeName.pb (all OS)

    Code file header:

    ;   Description: 
    ;            OS: Windows, Linux, Mac
    ; English-Forum: 
    ;  French-Forum: 
    ;  German-Forum: 
    ; -----------------------------------------------------------------------------
    
  • Multi-file code:

    CodeDirectoryName[Win], CodeDirectoryName[Lin,Mac], CodeDirectoryName (all OS)

    CodeDirectoryName/CodeInfo.txt:

    ;   Description: 
    ;            OS: Windows, Linux, Mac
    ; English-Forum: 
    ;  French-Forum: 
    ;  German-Forum: 
    ; -----------------------------------------------------------------------------
    

How I imagine the new format at the moment:

  • Single-file and multi-file code:

    CodeDirectoryName/PackageInfo.json:

    {
      "packageDescription": "This is an example package",
      "packageAuthors": "Author1, Author2",
      "packageVersion": "1.0 Beta 1",
      "packageUpdateDate": "02.10.2019 16:00",
      "packageLicense": "MIT License",
      "windowsSupport": true,
      "linuxSupport": true,
      "macSupport": true,
      "forumThread_DE": "https://www.purebasic.fr/german/...",
      "forumThread_EN": "https://www.purebasic.fr/english/...",
      "forumThread_FR": "https://www.purebasic.fr/french/...",
      "files": [
        {
          "fileName": "Logo.png",
          "authors": "Author1, Author2",
          "version": "1.0",
          "updateDate": "01.10.2019 10:30",
          "license": "CC-BY-4.0"
        },
        {
          "fileName": "Music.mp3",
          "authors": "Author",
          "version": "2.0",
          "updateDate": "04.09.2019 14:18",
          "license": "CC-BY-4.0"
        }
      ]
    }

As you can see, the new format now also supports metadata for non-code files.

The JSON can easily be read into a structure:

Structure CodeInfoFilesStruc
  fileName$
  authors$
  version$
  updateDate$
  license$
EndStructure

Structure CodeInfoStruc
  packageDescription$
  packageAuthors$
  packageVersion$
  packageUpdateDate$
  packageLicense$
  windowsSupport.i
  linuxSupport.i
  macSupport.i
  forumThread_DE$
  forumThread_EN$
  forumThread_FR$
  List files.CodeInfoFilesStruc()
EndStructure

[...]

ExtractJSONStructure(JSONValue(json), @codeInfo, CodeInfoStruc)
@tajmone
Contributor

tajmone commented Oct 3, 2019

Ciao @SicroAtGit,

I'm thinking about a new format for the code metadata.

The current format is too complicated:

I agree. Also, I think most programmers aren't willing to abide by a strict notation enforced in their source comments — each programmer has his/her own coding style, and can get fussy about having third-party tools force them to tweak their sources.

My personal opinion is that the ideal solution is to wrap each "module" (whether it's a single file or a collection of files) into an individual folder, and to have a single package description file per folder.

The package description file could be in JSON or YAML (the latter is better, but unsupported by PB, unfortunately).

Furthermore, I'd devise the package entries in view of a future package manager, so that when the time is ripe to create a real package manager capable of downloading third-party packages from the Internet (based on the user's OS and PB version), adopting it won't require rewriting everything from scratch.

Personally, I suggest looking at some already existing package manager, like Nimble, the official package manager for the Nim language (which is open source, and you could just adapt it for PB, with the benefit that it's constantly updated by the Nim creators).

I know that you'd rather implement the package manager fully in PB, but I'd advise you against it, for various reasons. First of all, the native design of PureBasic makes it hard to work with version control — from alternative ways of storing IDE and source info (in the source, in the folder, etc.) to the fact that PB UTF-8 sources use the BOM, and many others. Furthermore, being a closed-source language, you won't be able to exploit continuous integration features (which are essential in package managers).

There's nothing wrong with using another language for a package manager. Nim has the great advantage that it doesn't require complex Makefiles or shell scripts in its build toolchain (everything is done via Nim script and configuration files, which are also Nim code).

Furthermore, Nim is capable of compiling targeting specific compilers, including MSVC 2010 via C — which means that you can create a PureBasic library in Nim. This is an important feature, for PB won't accept Windows libraries created with other MS compiler versions, so you're going to face problems with C99 (since MS compilers before MSVS 2013 are only C89 compatible).

Also — let's face it! — when it comes to network operations, security is important, and PB's network functionality is not only very limited but also based on very old third-party libraries which are unlikely to have been patched against the latest security threats. Not to mention interfacing with Git and GitHub, or handling Semantic Versioning (for which PB offers no libraries at all).

PureBasic was never designed with user-contributed packages in mind, so many of its design choices make it difficult to take the community-driven path of sharing packages that would work as expected across different PB releases and OSs.

In any case, even if you persist in wanting to implement the package manager in PB, try at least to look at some good package managers for inspiration regarding the package info format, its entries, etc.

I'm convinced that instead of asking users to contribute their code to a central repository, it would be much better to allow them to run their own repositories independently and submit their packages to a central directory instead. The current system forces them to go through a central authority to update their own code, which is far from ideal. If this project were able to handle third-party code via Git repositories, it would be much better.

@SicroAtGit
Owner Author

Sorry for not responding to your answer for so long. Currently, I don't have enough free time for the project. It will continue shortly. Thank you very much for your help.

@SicroAtGit
Owner Author

My personal opinion is that the ideal solution is to wrap each "module" (whether it's a single file or a collection of files) into an individual folder, and to have a single package description file per folder.

This is how I planned it and tried to describe it above.

The package description file could be in JSON or YAML (the latter is better, but unsupported by PB, unfortunately).

I like YAML, too. Easier to read/write by humans. No curly brackets. Support for multi-line strings and comments. But as you have already mentioned, PB doesn't provide a native library for YAML.

Actually JSON is already too powerful for my purpose. I already thought about whether the INI format would be enough. PB can handle this format with the native preference library. Later, I realized that I need arrays (for authors and dependencies) and I don't want to imitate them with commas inside key values. Therefore, I stick with JSON.

Personally, I suggest looking at some already existing package manager, like Nimble, the official package manager for the Nim lang (which is open source, and you could just adapt it for PB, with the benefits that it's constantly updated by Nim creators).

I have looked at the package information from several package managers to decide which package information I want to include.

The package manager Nimble of the programming language Nim uses build scripts instead of simple package information. Currently, I tend to use only package information. The only thing I can imagine at the moment are the IDE tools that could be automatically compiled and inserted into the IDE. But for that a simple call of the PB compiler and code inside the package manager is probably sufficient.

In addition, Nimble requires that a separate Git repository be created for each package. The only way I see to avoid this would be to use orphaned branches within a repository, each containing only one package.

Downloading only specified directories of a repository is made possible by a new Git feature, but it's not yet supported by GitHub, so as far as I know, it's currently only possible via the GitHub API.

I know that you'd rather implement the package manager fully in PB, but I'd advise you against it, for various reasons. First of all, the native design of PureBasic makes it hard to work with version control — from alternative ways of storing IDE and source info (in the source, in the folder, etc.) to the fact that PB UTF-8 sources use the BOM, and many others. Furthermore, being a closed-source language, you won't be able to exploit continuous integration features (which are essential in package managers).

I will think about it. Writing the package manager with Nim would be a good entry project to learn the new programming language practically. The contributors then have to learn a new programming language to help, but to always stick to only one programming language (PureBasic) is also wrong, I agree with you.

I'm convinced that instead of asking users to contribute their code to a central repository, it would be much better to allow them to run their own repositories independently and submit their packages to a central directory instead. The current system forces them to go through a central authority to update their own code, which is far from ideal. If this project were able to handle third-party code via Git repositories, it would be much better.

I've thought about that. There are pros and cons. This is also a topic that is part of the upcoming milestone, and therefore there will be a separate issue for it shortly. I will then go into it in more detail there.

I thought about the package information again. The above mentioned PackageInfo.json is too detailed. Listing each file and naming its license, version etc. is too much.

This is how it looks now:

{
  "description": "Description of the package",
  "version": "SemVer 2.0.0",
  "license": "SPDX-License-Identifier",
  "type": "include or tool",
  "forums": {
    "de": "https://www.purebasic.fr/german/thread",
    "en": "https://www.purebasic.fr/english/thread",
    "fr": "https://www.purebasic.fr/french/thread"
  },
  "authors": [
    "Author1",
    "Author2"
  ],
  "support": {
    "windows": true,
    "macOS": true,
    "linux": true
  },
  "dependencies": [
    "IncludeName1",
    "IncludeName2"
  ]
}

Above I mentioned that I could imagine that the future package manager could automatically compile IDE tools and insert them into the IDE. This would be the case if the type is tool.

There is no name key that specifies the package name, because the directory name is the package name.

I don't want to change the directory structure into a flat structure (no subdirectories for includes), because I want to be able to see to which category/subcategory the include belongs without having to look into the PackageInfo.json file — assuming the category/subcategory is named in this file.
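To illustrate how little code the slimmed-down format needs on the consumer side, here is a quick sketch (in Python rather than PB, just to keep it short; the key names come from the JSON example above, while treating them all as mandatory is my own assumption, not part of the proposal):

```python
import json

# Minimal sanity check for the proposed PackageInfo.json format.
# The required keys mirror the example above; making them all
# mandatory is an assumption for this sketch.
REQUIRED_KEYS = {"description", "version", "license", "type",
                 "forums", "authors", "support", "dependencies"}

def load_package_info(text):
    info = json.loads(text)
    missing = REQUIRED_KEYS - info.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if info["type"] not in ("include", "tool"):
        raise ValueError("type must be 'include' or 'tool'")
    return info

example = """
{
  "description": "Description of the package",
  "version": "1.0.0",
  "license": "MIT",
  "type": "include",
  "forums": {"en": "https://www.purebasic.fr/english/thread"},
  "authors": ["Author1"],
  "support": {"windows": true, "macOS": true, "linux": true},
  "dependencies": []
}
"""
info = load_package_info(example)
```

In PB the same thing would be the ParseJSON/ExtractJSONStructure pattern shown earlier in this thread.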

@tajmone
Contributor

tajmone commented Feb 2, 2020

Git Sparse-Checkout

Downloading only specified directories of a repository is made possible by a new Git feature, but it's not yet supported by GitHub, so as far as I know, it's currently only possible via the GitHub API.

I didn't realize that sparse-checkout allows filtering folders and files; I've been using it on Travis CI just to limit the commit history depth, but had no clue it could also be used this way.

This is an important feature indeed, so I can't thank you enough for pointing it out to me! As for its usability in this project, it might not be mature enough and the risk is that many Git front-ends might not support it, creating problems with end users. Maybe in the future it will be better supported by third party Git apps and GitHub.

YAML Alternatives: TOML

I like YAML, too. Easier to read/write by humans. No curly brackets. Support for multi-line strings and comments. But as you have already mentioned, PB doesn't provide a native library for YAML.

TOML is a lightweight alternative to YAML, and it is much easier to implement.

We might consider writing a TOML parser in PB, or binding a C implementation.

Definitely, the storage format is important — and we wouldn't want to find ourselves having to switch formats after bumping into limitations in a year's time. The adopted format should be solid, flexible, and human-friendly.

PureBasic Preferences are a good format, but not enough for a package manager in my opinion — as you mentioned, there are many potential uses that could be added in the future, like auto-compiling IDE tools, etc., which would require a robust standard that allows nested settings.

JSON does cover most needs, and it's well supported by PB, but I'm personally not a great fan of JSON: it's still hard to read, and it doesn't allow comment lines or blocks (you need to add a "comment" entry).

In any case, switching from JSON to TOML/YAML in a later stage wouldn't be a huge deal, as long as the JSON format is still supported for backward compatibility. So it might be good to start off with JSON (and prevent huge delays), but keep the door open to the possibility of adding TOML/YAML in a second stage.

Design Thoughts

Script vs Data Info

The package manager Nimble of the programming language Nim uses build scripts instead of simple package information. Currently, I tend to use only package information.

Yes, Nim has this cool thing of being both a compiled language as well as a scripting language that runs in the Nim VM, which is then exploited for build scripts, settings files, and dynamic pre-processor functionality.

Of course, the project package manager could rely on Nim being present on the machine, but this would have a bad impact on many users (although Nim is really easy to setup, requiring just to unzip a folder and add it to the PATH).

But we should consider that we could just use PB code as if it was a script, i.e. by invoking the PBCompiler to execute it. Package scripts could be either stored in the package using a custom extension (e.g. .pbrun) or by storing the actual source code in the package data — as a huge JSON string, by escaping it, or in base64 using the PB libraries to serialize/deserialize the code; of course in YAML storing source code is simpler and doesn't require escaping.

Anyhow, it's possible to use PB code in the packages by adopting various solutions, so I think that this possibility needs to be considered beforehand, to make sure the specs leave room for its future implementation.

VCS Tools

In addition, Nimble requires that a separate Git repository be created for each package. The only way I see to avoid this would be to use orphaned branches within a repository, each containing only one package.

I thought this was the idea, i.e. to have packages as repositories (or GitHub Gists), and then include them as Git submodules, which allows you to control in the Archiv which commit to check out for each submodule (and prevents unpleasant surprises, in case the upstream adds malicious code to the repo).

In general, the trend of all modern package managers is to rely on repositories via tools like Git (Go, for example, uses Git, Mercurial and even Bazaar) and expect each package to be a stand-alone repository.

Personally, I didn't like the fact that Go uses so many different VCS tools, because I couldn't get hold of some packages without installing Bazaar (which wasn't worth the effort).

Also, I wonder if using Git is the right approach. The PB community is not responding well to Git, with a small number of users willing to struggle with its steep learning curve, and with Git being a huge package on OSs like Windows.

Fossil, on the other hand, is a much more powerful and modern tool than Git; it ships as a single stand-alone binary on all OSs and doesn't require all the external POSIX tools that make Git not fully compliant on Windows. Also, Fossil repositories include a full wiki for documentation, and Fossil supports multiple concurrent checkouts, allowing you to work in different folders with different checkouts and branches (or the Fossil equivalents of Git branches).

ChiselApp is a free Fossil service, a bit like GitHub. But Fossil can run as a full server natively — no need to install other tools, just launch it via the CLI. Surely, on ChiselApp you don't have all these amazing services like Travis CI, Circle CI, etc., but it does contain all the extra tools offered by GitHub (wiki, issue discussions, etc.), and all this in a single binary file. There's also an official GUI frontend, and it's easier to learn than Git in my opinion.

Programming Lang

Writing the package manager with Nim would be a good entry project to learn the new programming language practically. The contributors then have to learn a new programming language to help, but to always stick to only one programming language (PureBasic) is also wrong, I agree with you.

I'm not sure I'd worry too much about contributors, for these seem to be scarce anyhow, so I'd rather focus on choosing a language which is powerful and (most of all) fully open source — so we can use CI tools for its development and testing, which can't be done with PB.

PB lacks too many features when it comes to network security, and the MSVC 2010 limitation prevents binding to C99 libraries under Windows, not to mention being unable to bind C++ libraries — all of which makes it hard (or impossible) to exploit existing libraries for YAML, LibGit, etc., which ultimately means we'd have to either give up some cool features or implement them from scratch in PB (which would take ages).

Unfortunately, package managers are complex beasts, so it's hard to imagine creating one without third-party libraries. Also, security is an issue here, and with many PB libraries lagging years behind, I'm not comfortable using PB to carry out network operations that might involve passwords and security tokens.

Just imagine how hard it would be to exploit the GitHub API in PB, having to write everything from scratch due to C interfacing limitations. Not feasible.

But we need to consider carefully the impact of those OSs that have dropped 32-bit support, like macOS and iOS. For example, Rust has already announced that it will no longer support 32-bit releases for Mac, and I expect other languages will soon follow along this path.

Ubuntu (and other Linux distros) are also going to drop 32-bit support soon. I'm not sure what's going to happen to PB in this regard, i.e. whether it will also drop 32-bit editions for Mac and Linux, but we must be aware that if we use other languages (Nim, Rust, etc.) these might soon be no longer available in 32-bit, so we might not be able to compile 32-bit binaries for macOS and Ubuntu.

I'm surprised that this big change isn't being discussed much on the PB Forum. On the IDE repo, it was dismissed as something not impacting PB much, but I seriously doubt that this will be the case — I mean, how are the PureLibs based on 3rd party components going to be updated if the original authors stop developing their 32-bit versions for Mac and Ubuntu?

Closing Thoughts

I think that the package manager needs brain-storming and jotting down a list of all the things that it should/might cover, discuss their pros and cons, and start drafting a Technical Specifications document, so we have a clear road map of where we need to go, and decide which tools and languages will serve the purpose best.

@SicroAtGit
Owner Author

SicroAtGit commented Feb 22, 2020

Downloading only specified directories of a repository is made possible by a new Git feature, but it's not yet supported by GitHub, so as far as I know, it's currently only possible via the GitHub API.

This is an important feature indeed, so I can't thank you enough for pointing it out to me! As for its usability in this project, it might not be mature enough and the risk is that many Git front-ends might not support it, creating problems with end users. Maybe in the future it will be better supported by third party Git apps and GitHub.

My thought in this context was that there should also be the possibility to manage multiple packages in one repository, like it is in this repository and in the repository from @Hoeppner1867.

So the archives can still be downloaded as ZIP packages without the need for a package manager. That's why I mentioned above that I don't want a flat directory structure (at least not as long as there is no package manager), because otherwise it would become confusing to navigate through the directories without category/subcategory directories. Cross-package changes are also done quickly with one commit, instead of having to repeat the commit in every package repository that needs the change.

The problem with this variant was that Git could not download only certain directories of a repository. With the new Git feature, that works now. The next problem, however, is that GitHub does not yet support this new feature.

Alternatively, the download could be done via the GitHub API, which would additionally not require an installed Git program, because then only a JSON parser and a URL downloader would be required.

The GitHub API variant requires multiple recursive requests to download all the contents of the package's subdirectories as well. It's also a GitHub-only solution — bad if someone makes their package available from another hosting provider (e.g., GitLab).
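To make the "multiple recursive requests" point concrete, here is a Python sketch of the directory walk the GitHub contents API (`GET /repos/{owner}/{repo}/contents/{path}`) forces on a client. The fetching function is injected so the traversal logic can be shown without network access; one request per directory is the unavoidable cost:

```python
# Sketch of the recursive walk required by the GitHub contents API.
# `fetch_json(url)` stands in for an HTTP GET returning parsed JSON;
# the endpoint shape follows the public GitHub REST API.

def list_package_files(owner, repo, path, fetch_json):
    """Return all file download URLs under `path`, one API call per directory."""
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/{path}"
    files = []
    for entry in fetch_json(url):
        if entry["type"] == "dir":
            # every subdirectory costs another request
            files += list_package_files(owner, repo, entry["path"], fetch_json)
        elif entry["type"] == "file":
            files.append(entry["download_url"])
    return files
```

A real client would also have to handle rate limiting and pagination, which makes this variant even less attractive compared to a plain `git` download.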

Extracting the packages (and their commits) from the CodeArchive and creating a separate repository/gist for each package would be the cleanest way, but I'm still not sure if I want to take this step — @Hoeppner1867 would have to participate as well in this case, otherwise it doesn't make much sense.

In any case, switching from JSON to TOML/YAML in a later stage wouldn't be a huge deal, as long as the JSON format is still supported for backward compatibility. So it might be good to start off with JSON (and prevent huge delays), but keep the door open to the possibility of adding TOML/YAML in a second stage.

TOML looks very good; I'll take it. I don't want to implement two parsers (JSON, for backward compatibility, and TOML) in the package manager, so I will use TOML immediately.

Of course, the project package manager could rely on Nim being present on the machine, but this would have a bad impact on many users (although Nim is really easy to setup, requiring just to unzip a folder and add it to the PATH).

Yes, actually I want to avoid that the package manager requires too many third-party tools.

But we should consider that we could just use PB code as if it was a script, i.e. by invoking the PBCompiler to execute it. Package scripts could be either stored in the package using a custom extension (e.g. .pbrun) or by storing the actual source code in the package data — as a huge JSON string, by escaping it, or in base64 using the PB libraries to serialize/deserialize the code; of course in YAML storing source code is simpler and doesn't require escaping.

Yes, I also had that thought — a file within the package that contains PB code and serves as a package installer script (e.g., compile PB IDE tool and automatically set it up in the PB IDE tool management).

I have already gone so far in my mind that I have taken PB code as package information:

#Description         = "Description of the package"
#Version             = "1.2.3.4"
#Date                = "2020-01-24"
#License             = "MIT"
#Authors             = "Author1, Author2"
#Support_Windows     = #True
#Support_MacOS       = #True
#Support_Linux       = #True
#Support_SpiderBasic = #False
#Dependencies        = "PackageName1, PackageName2"
#Forum_Thread_DE     = "https://www.purebasic.fr/german/thread"
#Forum_Thread_EN     = "https://www.purebasic.fr/english/thread"
#Forum_Thread_FR     = "https://www.purebasic.fr/french/thread"
; End Of Package Information

CompilerIf #PB_Compiler_IsMainFile
  
  ; Installation code of the package
  
CompilerEndIf

A very simple PB parser would then process the constants until it reaches the comment:

; End Of Package Information

So it would be package information and package installation script in one file. But storing the package information in PB constants would be as inflexible as INI files. So I have discarded this variant. But a PB code file only for package installation is certainly a good idea.
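For the record, the "very simple PB parser" described above is indeed only a few lines. Here is a sketch (in Python for brevity; the `#Name = "value"` line format and the end marker are taken from the example above):

```python
# Sketch of the simple header parser described above: collect the
# `#Name = "value"` constants until the end-of-information comment.

def parse_package_header(source):
    info = {}
    for line in source.splitlines():
        line = line.strip()
        if line == "; End Of Package Information":
            break  # everything after this is the installation script
        if line.startswith("#") and "=" in line:
            name, _, value = line.partition("=")
            info[name[1:].strip()] = value.strip().strip('"')
    return info

header = '''#Description = "Description of the package"
#Version     = "1.2.3.4"
#Support_Windows = #True
; End Of Package Information
#NotRead = "ignored"
'''
info = parse_package_header(header)
```

The simplicity is the appeal of that variant, even if, as noted, constants end up being as inflexible as INI files.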

I thought this was the idea, i.e. to have packages as repositories (or GitHub Gists), and then include them as Git submodules, which allows you to control in the Archiv which commit to check out for each submodule (and prevents unpleasant surprises, in case the upstream adds malicious code to the repo).

The submodule variant would be nice. Then I could save myself the creation of such a listing of all packages.

As you mentioned, I would be better able to see when malicious code creeps into the external package repositories. This also worries me, because most people will install the packages and then use them right away, instead of first carefully checking all the code within the installed package for malicious code.

But here we have again the problem that currently not every package has its own repository/gist. Currently, this is an advantage because there is no package manager and all packages can be downloaded as a ZIP package at once.


The first step I want to do is to extract the package information from the PB code files and the CodeInfo.txt files into PackageInfo.toml files. I will also remove the operating system information from the directory names and code filenames of the packages. Furthermore, I will also create a package directory for packages that consist of only one code file.

The second step, for which I will create a new issue, will be about:

  • Whether we also need such a packages file in the root directory which lists all packages:
  • Whether we want to keep the nested directory structure, which represents categories and subcategories, or whether we switch to a flat directory structure so that every package directory is in the root directory, as it is with the @Hoeppner1867 repository. In the second case, we could place the category and subcategory in the package information — in this case we have the alternative of using tags instead of categories.
  • Whether we want to switch from multi-package repositories to single-package repositories.

The third step, for which I will create a new issue again, will be about:

  • Whether we take Nimble and adapt it for this repository or create a new package manager from scratch with Nim or PureBasic
  • What features the package manager should have

Currently, I imagine the PackageInfo.toml file like this:

description = "Description of the package"
version     = "SemVer 2.0.0"
license     = "SPDX-License-Identifier"
type        = "include or tool"

websites = [
  "https://www.purebasic.fr/german/thread",
  "https://www.purebasic.fr/english/thread",
  "https://www.purebasic.fr/french/thread",
  "https://www.my-website.com"
]

authors = [
  "Author1",
  "Author2"
]

dependencies = [
  "PackageName",
  "PackageName#1.0.0",  # tag name
  "PackageName#1343866" # commit hash
]

[support]
windows     = true
macOS       = true
linux       = true
spiderbasic = true

@tajmone
Contributor

tajmone commented Feb 22, 2020

Directory Structure

Whether to keep the current directory system of sub-foldering by category, or instead opt for a flat folder system and use tags to handle categories is a crucial decision (if not the most crucial one)!

This is the core dilemma of Information Architecture, the science of organizing book libraries and their catalogues.

A book has a single physical location in the library, but can be placed in many different categories in the catalogue. Software packages are similar to books in many ways, but different in others.

Subfoldering: Pros and Cons

The problem with subfoldering by category is that you'll eventually stumble on a package belonging to multiple categories, so you'll need to arbitrarily choose under which category/folder to place it (at the expense of the others).

So, if on the one hand subfoldering makes navigation of the repository easier, it can soon become confusing (if not deceiving) when faced with packages that could belong to more than one category in their own rights (e.g. a multi-functional tool).

Surely, one could use symlinks to mirror the package in all the required folders, but this has some disadvantages — creating symlinks requires running the package manager with elevated privileges, and not all editors handle symlinks well.

Flat Directory and Tags: Pros and Cons

On the other hand, using a flat directory system and relying on tags for categories lifts away the above-mentioned problems, but it makes package discovery more difficult without a tool that can list packages by category.

This system works well with books because each book has a unique ISBN, so even if two books had the same title and both their authors were same-named, you'd still have a different ISBN to distinguish them.

Since same-named packages are not as rare as one might think, the usual solution in package managers is to use the author ID as the root folder of the package. So we could have both the tajmone/RegEx/ and Sicro/RegEx/ packages without conflicts — i.e. assuming the user ID is unique because it's the real user ID of a service like GitHub. Of course, if multiple services are supported, then we might have a problem, because UserX on GitHub and UserX on Bitbucket might be two different authors altogether.

Anyhow, the NPM package manager adopts this system to store packages locally, and relies on GitHub as the source of its packages.

Adding the user ID isn't exactly a totally flat solution, as it involves an extra folder for the author, but at least it uses a fixed directory depth.
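The fixed-depth idea is trivial to sketch. Prefixing the hosting service is my own extension here, to cover the "UserX on GitHub vs UserX on Bitbucket" ambiguity mentioned above; all names are hypothetical:

```python
# Sketch of the fixed-depth layout discussed above: one service folder,
# one author folder, one package folder, so same-named packages from
# different authors (or services) never clash on disk.
from pathlib import PurePosixPath

def package_path(root, service, author, package):
    return PurePosixPath(root) / service / author / package

p1 = package_path("packages", "github", "tajmone", "RegEx")
p2 = package_path("packages", "github", "Sicro", "RegEx")
```

Both RegEx packages get distinct, predictable paths, and the depth never varies regardless of category.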

Conclusions

In any case, if subfoldering by category were to be adopted, I think it would be better to adopt a fixed-depth system (e.g. 3 levels max) — even though in this repo we did manage to handle subcategories smartly by looking for the presence of certain files indicating whether a subfolder was a project or another category.

I tend to prefer the flat folder approach, but I realize that organizing contents by author ID might be necessary, especially in view of package forks where a third party might maintain a slightly tweaked version of another package.

Packages Lists

Adopting a central list of all packages is quite common in package managers.
It also allows users to add their own package to the central list via a pull request, simplifying the work for the package manager maintainers, but at the same time requiring their approval to enlist the new package.

Another benefit of adopting centralized lists is that it allows end users to create alternative package lists, and if the pack-manager allows overriding the central list URL, then it becomes easy to switch to different list servers or use multiple package sources.

IMO, this feature would make the pack-manager more appealing to end users, who could see it as a universal tool that can be used independently of the "official" package list.

E.g. Asian or Russian PB users might create their own packages directory, with all descriptions translated to their native language, possibly hosting some packages which are of interest only to native speakers. So, a Russian user might add the Russian packages list on top of the official one, or just use the former.

Many package managers allow this kind of overriding.
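The overriding behavior described above could work roughly like this (a minimal Python sketch; the list format, field names, and URLs are purely hypothetical assumptions for illustration):

```python
def merge_package_lists(*lists):
    """Merge several package lists into one. Later lists override
    earlier entries with the same (author, name) key, so a
    user-supplied list can shadow entries of the official one."""
    merged = {}
    for packages in lists:
        for pkg in packages:
            merged[(pkg["author"], pkg["name"])] = pkg
    return list(merged.values())

# Hypothetical lists: an "official" one and a Russian community one
# that translates/overrides one entry and adds a local-only package.
official = [{"author": "Sicro", "name": "RegEx",
             "url": "https://example.org/official/RegEx"}]
russian  = [{"author": "Sicro", "name": "RegEx",
             "url": "https://example.org/ru/RegEx"},
            {"author": "IvanX", "name": "Cyrillic",
             "url": "https://example.org/ru/Cyrillic"}]

merged = merge_package_lists(official, russian)
```

The key design choice is the precedence order: placing the user's lists after the official one means community lists win on conflicts, which matches the "add the Russian list on top of the official one" scenario.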

General Thoughts

There are many "this-versus-that" points that keep popping up when discussing the package manager. E.g. using only Git vs many VCSs; hosting all packages in a single repo (via submodules) vs using a JSON list pointing to different repositories; and so on.

Many of these dilemmas don't have a clear-cut answer, and much depends on whether and how the PB community would actually contribute to the packages list. These are not easy choices, especially since there is currently only a small presence of PB projects on GitHub, and most of them are single-contributor projects.

I tend to think it might be better to approach this project one step at a time — i.e. while having a clear ultimate goal, approach it gradually. Although this would imply migrating from one system to another, it might make more sense, especially if the migration is planned ahead and can be a smooth transition.

I.e. I suggest giving priority to the choices that are certain, start working with them, and then wait and see — so that the more problematic choices can be postponed until there are more real use-case scenarios and (most of all) until we start seeing a considerable user base for the package manager and can get some real feedback.

Let's not forget that besides all these important considerations on the package manager design side, there are also a number of unresolved issues relating to version controlling PB projects — with PB being still rather hostile to version control due to its native IDE settings and a lack of VCS-friendly alternative tools and standards.

Not to mention all the problems relating to PB releases, and how packages should relate to them in terms of compatibility and testing — we can't rely on CI services to test that old code works well with new PB versions, due to the PBCompiler being proprietary, and things are further complicated by the need to test code across all three OSs and architectures (not to mention SpiderBasic compatibility for packages that also target SB).

I doubt that it would be possible to build a package manager that can address all these issues at once in its first incarnation. It's more likely that this is going to be a long process, where each step has to face new challenges and propose new solutions, which once adopted can pave the way for the next step.

I personally don't see it as a problem if the whole project had to undergo various version changes, even if that meant breaking backward compatibility and changing standards — as long as the process gathers a real following, and the community starts to use it, then I'm sure that time will tell, and the whole project will find its own way and shape as it grows naturally.

So, in this respect, I dare say that "simpler is better" and "less is more".
The current project has already reached some important goals, and led to various hands-on experiments, which have taught us some important lessons (especially in terms of which paths we shouldn't take). Now it's time to find a balanced solution that could gain a following among end users, and make it easy and safe to use. We can already benefit from some of the precious lessons learned on the PB IDE project, regarding code conventions and how to enforce them, which is a good starting point.

@SicroAtGit
Owner Author

I am currently writing the TOML lexer in PureBasic by using my lexer module. Then I will start writing the TOML parser. I think these are the safest tasks that can be done at the moment.
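The tokenizing step could be sketched along these lines (an illustrative Python sketch, not the actual PureBasic lexer module; it covers only an assumed small TOML subset — tables, bare keys, basic strings, integers, and booleans):

```python
import re

# Ordered token patterns for a small TOML subset. Order matters:
# BOOLEAN must precede BAREKEY so "true"/"false" aren't lexed as keys.
TOKEN_SPEC = [
    ("WHITESPACE", r"[ \t]+"),
    ("COMMENT",    r"#[^\n]*"),
    ("NEWLINE",    r"\n"),
    ("LBRACKET",   r"\["),
    ("RBRACKET",   r"\]"),
    ("EQUALS",     r"="),
    ("STRING",     r'"[^"\n]*"'),
    ("BOOLEAN",    r"\b(?:true|false)\b"),
    ("INTEGER",    r"[+-]?\d+"),
    ("BAREKEY",    r"[A-Za-z0-9_-]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Yield (token_type, lexeme) pairs; skip whitespace and comments."""
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        if m.lastgroup not in ("WHITESPACE", "COMMENT"):
            yield (m.lastgroup, m.group())
        pos = m.end()

tokens = list(tokenize('[package]\nname = "RegEx"\nstable = true\n'))
```

A parser would then consume this token stream to build the key/value tree, which is where most of the real work (dotted keys, arrays, multi-line strings, dates) lives.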

@tajmone
Contributor

tajmone commented Apr 11, 2020

I am currently writing the TOML lexer in PureBasic by using my lexer module.

This sounds really cool. Keep me updated on this.
