Request For Comments (RFCs)

Proposals

001: Use Org Mode

nexp
rfc:salotz.001_use-org.org

Proposal: ./rfcs/salotz.001_use-org.org

Abstract:

Use org-mode LOL.

002: Semantic Changelog

nexp
rfc:salotz.002_semantic-changelog.org

Proposal: ./rfcs/salotz.002_semantic-changelog.org

Abstract:

A mini-format for git commit messages that supports machine-readable (without sacrificing human readability) semantic expression of the impact of code changes.

003: RFC Specifications

nexp
rfc:salotz.003_rfc-specs

Proposal: ./rfcs/salotz.003_rfc-specs.org

Abstract:

Specifications for necessary information for an RFC and how to format it.

004: Name Expressions (nexps)

nexp
rfc:salotz.004_nexps

Proposal: ./salotz.004_nexps.org

Abstract:

A proposal that defines how to name digital document entities: e.g. format:namespace.field-1_field-2
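As a rough illustration, a name expression of this shape could be split apart as follows. The grammar here is only an assumption inferred from the single example above (a format prefix before the colon, an optional namespace before the first dot, then underscore-separated fields); the RFC itself may define it differently, and `Nexp` and `parse_nexp` are hypothetical names.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed grammar, inferred only from "format:namespace.field-1_field-2".
NEXP_PATTERN = re.compile(
    r"^(?P<format>[^:]+):"           # format prefix, e.g. "rfc"
    r"(?:(?P<namespace>[^._]+)\.)?"  # optional namespace, e.g. "salotz"
    r"(?P<fields>.+)$"               # underscore-separated fields
)


class Nexp(NamedTuple):
    format: str
    namespace: Optional[str]
    fields: List[str]


def parse_nexp(text: str) -> Nexp:
    """Split a name expression into format, namespace, and fields."""
    match = NEXP_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a name expression: {text!r}")
    return Nexp(
        format=match.group("format"),
        namespace=match.group("namespace"),
        fields=match.group("fields").split("_"),
    )


print(parse_nexp("rfc:salotz.004_nexps"))
```

Names without a namespace, such as rfc:010_growth-versioning, parse with the namespace set to None under this assumed grammar.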

005: Syxel Buffers Format

nexp
rfc:salotz.005_syxel-buffers

Proposal: ./rfcs/salotz.005_syxel-buffers.org

Abstract:

Format that utilizes a granular unit of ‘syxel’ (symbolic pixel) in a rectangular plane similar to a pixel based image format. Syxels are basically character points.

006: Codetags

nexp
rfc:salotz.006_codetags

Proposal: ./rfcs/salotz.006_codetags.org

Abstract:

Tags that are added in comments to code that add semantic meaning to otherwise freeform comments, making them searchable by machine and available to tooling.

007: Python Library Repo Layout

nexp
rfc:salotz.007_py-lib-layout

Proposal: ./rfcs/salotz.007_py-lib-layout.org

Executive Summary:

A layout and content specification that follows best practices and strives for effectiveness rather than complete compatibility.

008: Linux Shell Startup Indirection Layer

nexp
rfc:salotz.008_linux-shell-startup

Proposal: ./rfcs/salotz.008_linux-shell-startup.org

Executive Summary:

A collection of script files which add a layer of indirection to standard linux shells (i.e. POSIX sh and bash) that allows for a more modular, composable, and semantic organization.

009: Repo Layout Templates

nexp
rfc:salotz.009_repo-layout

Proposal: ./rfcs/salotz.009_repo-layout.org

Executive Summary:

A template format for generating and specifying repo layouts in directories, i.e. cookiecutter.

010: Growth Mindset Versioning

nexp
rfc:010_growth-versioning

Proposal: ./rfcs/010_growth-versioning.org

Executive Summary:

Version formatting that doesn’t lie.

011: Writetags

nexp
rfc:011_write-tags

Proposal: ./rfcs/011_write-tags.org

Executive Summary:

Tags (see rfc:salotz.006_codetags for a similar proposal) that are added in comments to prose (documentation, whitepapers, technical documents, etc.) that add semantic meaning to otherwise freeform comments, making them searchable by machine and available to tooling.

012: Errors as Information

nexp
rfc:012_information-errors

Proposal: ./rfcs/012_information-errors.org

Executive Summary:

Errors in information systems, like event logs, should take on a role of providing information rather than being categorized based on their operational importance as is the case in most languages (e.g. checked vs. unchecked exceptions). Inspired by Stuart Halloway and specifically this talk: https://youtu.be/oOON--g1PyU?t=965

013: Code Targets

nexp
rfc:013_code-targets

Proposal: ./rfcs/013_code-targets.org

Executive Summary:

Code targets are a specification for a syntax which should be recognized by metaprogramming tools like template engines, editors, etc.

014: Project & Maintenance Intentions for OSS

nexp
rfc:salotz.014_project-intent

Proposal: ./rfcs/salotz.014_project-intent.org

Executive Summary:

A proposal for a best practice that includes a clear statement of intent on open source projects that would allow for greater understanding by consumers about the current state and intentions of developers in fairly unambiguous terms. Extensions could provide optional semantic vocabularies for very clear intent and to provide a set of choices for developers so they don’t have to think as much from scratch.

015: Collaborative Line-Oriented Plaintext Document Editing Protocol

nexp
rfc:salotz.015_collab-line-editing

Proposal: ./rfcs/salotz.015_collab-line-editing.org

Executive Summary:

A protocol and format to facilitate low-friction editing of line-oriented document formats like LaTeX, Markdown, Org-mode, etc.

016: Nearly Trivial Plaintext Formats

nexp
rfc:salotz.016_trivial-plaintext-formats

Proposal: ./rfcs/salotz.016_trivial-plaintext-formats.org

Executive Summary:

A small collection of nearly trivial plaintext formats along with file extensions. Includes things like a line based list format.

017: Bunker: User De-Militarized Zone

nexp
rfc:salotz.017_bunker

Proposal: ./rfcs/salotz.017_bunker.org

Introduces the concept of a bunker directory for user only data in their respective $HOME directories which is simply .${USER} or .${USER}.d. This is to allow a safe-space for customization and configuration that will be unmolested by other programs since it is unlikely that they will have this hardcoded in their behavior.

018: Secrets Database

nexp
rfc:salotz.018_secret-database

Proposal: ./rfcs/salotz.018_secret-database.org

A specification for a secrets database using GPG and mostly compatible with tools like pass.

019: Meta-Project Protocol (MPP)

nexp
rfc:salotz.019_MPP

Proposal: ./rfcs/salotz.019_MPP.org

Protocol for live-updating of file-based development projects.

020: In-Repo Issue Tracking Schema

nexp
rfc:salotz.020_repo-issue-tracker

Proposal: ./rfcs/salotz.020_repo-issue-tracker.org

Executive Summary:

Schema for including issue tracking sources with a project, without having to rely on outside forges.

021: Org-mode Next-Gen

nexp
rfc:salotz.021-org-mode-ng

Proposal: ./rfcs/salotz.021_org-mode-ng.org

Executive Summary:

Proposals for the next generation of org-mode involving simplifying, modularizing, and making it more extensible and amenable to a wider non-emacs ecosystem.

022: template inheritance

nexp
rfc:salotz.022_template-inheritance

Proposal: ./rfcs/salotz.022_template-inheritance.org

Executive Summary:

Taking inspiration from gitignore.io and their template structure. Specifies a way to layout a repository of templates that allows for: stacks, inheritance, and ordering.

023: Editor RFIs

nexp
rfc:salotz.023_editor-rfis

Proposal: ./rfcs/salotz.023_editor-rfis.org

Executive Summary:

Publish a series of RFI standards for editors that describe the behavior of specific features.

024: Next-Gen Vector Graphics Formats

nexp
rfc:nextgen-vector-graphics-formats

Proposal: ./rfcs/nextgen-vector-graphics-formats.org

Executive Summary:

Specifications for a suite of vector graphics formats, namely providing both plain-text and efficient binary representations suited for different kinds of applications.

Request For Implementations (RFIs)

In addition to merely specifying best practices, protocols, and standards via RFCs, it has come to my attention that there is a need for a listing of concrete proposals for actual working software that performs critical functionality.

The purpose here is not to “shout at clouds” but to attempt to actually evaluate and validate the need for the introduction of new software.

In addition to describing the purpose and semantics of the proposed software, in-depth analyses of prior art should be done as well, with feature matrices showing gaps in the current functionality.

It is unclear exactly how to incorporate the idea of the ease of integration (or perhaps “civility”) of a piece of software, but attempts should be made to take this into account. E.g. is it open-source and licensed in such a way that it could be used in perpetuity?

Name is inspired by the Scheme Request For Implementations (SRFIs).

Version Control for mixed collections of digital assets

Description

Categories of reflection capable for data objects:

“black” objects
opaque binary blobs
“grey” objects
e.g. PostScript files, XML, image formats
  • specialized and customizable diffing functionality
“white” objects
those with well-known (semi-universal) diffing strategies and merge techniques

Ideas:

Beyond decentralized vs centralized

There should be a middle-ground between Decentralized and Centralized VCS which is more similar to federated systems.

The idea is that authority over the “master copy” (centralized vs. decentralized) is controlled (by some means) and can be delegated to a collection of hub servers.

A purely decentralized system (like git) is fundamentally unable to handle black objects, because changes to them can only be controlled in one of two ways: through sheer replacement (called arbitrary replacement, in which no merge strategy is possible and decisions are completely arbitrary and determined by authority) or through locking (change control, which is again completely arbitrary if we are to avoid “wars” between users).

While both arbitrary replacement (arbitration) and change control (delegation) are viable solutions in general for dealing with black objects, they can’t be implemented with git without an extension to the protocol. These protocol extensions are typically performed by the hosting service which controls the “master copy” (like github or gitlab, called forges).

This leads to lock-in and/or a lack of ability to cooperate with others and essentially a centralized, non-distributed paradigm that allows for offline work encoded as a temporary “fork” of authority, despite the “decentralized” moniker.

While any new VCS system should surely support offline-work it should be reified in the protocol, rather than being implemented as a temporary fork.

Furthermore, the usage of the two strategies for blackish objects (arbitration and delegation) should also be reified and be able to be composed.

Here is an example:

Let’s say you are an organization that is working on a large product and you have more than one team.

You don’t want the progress of one team to hold up the other one and you would like them to work completely in parallel.

However, you don’t want them to make mutually incompatible changes to objects on which they really should be cooperating with other teams (particularly blackish objects, which can’t be intelligently merged).

So you split up the entire repo image (the “monorepo” perhaps) into several regions:

team A’s region
region A
team B’s region
region B
shared resources region
region S
inaccessible region (for other teams or administrators only etc.)
region X

This allows work in regions A and B to happen quickly and according to the self-organization of the respective teams.

This forbids access to region X which should either be irrelevant parts of the monorepo or sensitive reflective configuration data that should only be controlled by the reigning authority (such as the access control itself).

Both teams can use region S, however, and merges therein should be performed by an authority.

However, within this shared region we still want to implement locks for blackish objects or implement replacements.

These should then be determined by the overarching authority (for consensus) and as such must be centralized. Remember consensus means slow.

Because we are able to isolate only the few contentious resources both teams must access, regions A and B need not gain consensus between them.

So in effect we have a hybrid decentralization and distribution strategy that is allowed by a hierarchy in which authority is delegated to allow for parallel work to be done, while still maintaining consensus.
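The region example can be sketched as a small policy table. This is only an illustration under stated assumptions: the team names, the `Access` levels, and the rule that any region not listed for a team is denied are all hypothetical and not part of any existing VCS.

```python
from enum import Enum
from typing import Dict


class Access(Enum):
    NONE = "none"            # e.g. region X: off limits to the team
    AUTHORITY = "authority"  # e.g. region S: changes go through the central authority
    FREE = "free"            # the team's own region: no consensus needed


# Hypothetical policy for the two-team example.
# Assumption: anything not listed for a team is denied.
POLICY: Dict[str, Dict[str, Access]] = {
    "team-a": {"A": Access.FREE, "S": Access.AUTHORITY},
    "team-b": {"B": Access.FREE, "S": Access.AUTHORITY},
}


def access_for(team: str, region: str) -> Access:
    """Look up what a team may do in a region."""
    return POLICY.get(team, {}).get(region, Access.NONE)


def needs_consensus(team: str, region: str) -> bool:
    """True when a change must be arbitrated or locked by the authority."""
    return access_for(team, region) is Access.AUTHORITY


print(access_for("team-a", "A"), needs_consensus("team-a", "S"))
```

Only region S ever reaches the slow consensus path here; work in regions A and B never consults the authority.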

Specifying Networks

The main idea behind my refugue project is that you can manage your own personal network (PN) while using something like the internet or sneakernet to actually perform the transfers etc. Kind of like urbit.

The point being that at a semantic level the personal network is what matters to an individual or enterprise and not really the internet or what ICANN says about domain names.

This is something I always struggled with because it felt that control of my own network was out of my hands. In reality ICANN and the internet simply enable the underlying transport to take place conveniently and to build my network on top of.

This network really isn’t that complicated and simply relies on some mapping of unique pet names to a collection of addresses at which you may find that peer. These can be:

  • IP addresses
  • domain names
  • zeronet addresses
  • file paths to mounted volumes

etc.
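A minimal sketch of such a mapping, assuming a plain in-memory dictionary; the pet names and addresses below are made up for illustration and `addresses_for` is a hypothetical helper.

```python
from typing import Dict, List

# Hypothetical personal network: pet name -> addresses to try, in order.
PERSONAL_NETWORK: Dict[str, List[str]] = {
    "laptop": ["192.168.1.20", "laptop.example.org"],
    "backup-drive": ["/mnt/backup"],
}


def addresses_for(pet_name: str, network: Dict[str, List[str]] = PERSONAL_NETWORK) -> List[str]:
    """Return every known address for a peer, whatever the transport."""
    try:
        return list(network[pet_name])
    except KeyError:
        raise KeyError(f"unknown peer: {pet_name!r}") from None


print(addresses_for("laptop"))
```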

The same is useful for a distributed/decentralized version control system, and for allowing polymorphism in the data that resides on each peer.

For example I want to check out just a portion of the enterprise monorepo because I either don’t need all of it, can’t hold all of it, or am not allowed to hold all of it.

Specifying a network from a single holder of authority is a way to achieve this.

How consensus on this single point of authority is decided can be customized but can be something like:

  • zookeeper or raft
  • blockchain
  • held in a person i.e. CEO

Prior Art

  • boar
  • git-lfs
  • git-annex
  • subversion

In-Repo Issue Tracking

Description

Prior Art

git-issue

https://github.com/dspinellis/git-issue

Pros:

  • import/export to common forges like github and gitlab
  • implemented in shell scripts

Cons:

  • fixed schema, oriented around small files of undescribed schemas, non-extendable
  • only works with git
  • not easily editable directly in an editor:
    • metadata and description spread across many files for each issue
    • file/dir names based on hashes rather than anything meaningful

deft

https://github.com/npryce/deft

Pros:

  • VCS agnostic
  • Text-editor friendly
    • issues live in a renamable dir
    • only 2 files: description + metadata
    • files named after file

Cons:

  • Python 2
  • no forge integration
  • must manually name issues

git-dit

https://github.com/neithernut/git-dit

Requirements

After reviewing the existing options I like the UX of deft the best, but it suffers from some implementation issues, mainly that it is implemented in Python 2.

I want my issues to be editable with a text editor not only as a secondary means, but as a first-class thing.

This doesn’t preclude command line tools for this purpose but using only a text editor should be ergonomic.

Ideally it would get integrated into emacs or VSCode etc., which is how these kinds of tools work better anyhow.

Single file issues

Each issue should be only a single file.

It is too annoying to switch between different files to tweak and twiddle metadata.

In the limit individual issues could be of different actual formats that allow for varying degrees of semantic data. (similar to git commits).

Issues can be in SOML (my minor TOML variant for writing lots of text) where you can add tags, flags, and metadata as you please alongside long descriptions in different markup languages.

They can also be in other formats as you wish and should be a decision each project should make.

That is the schema/protocol for the directory layout etc. should be orthogonal to the content of the actual issues.

This can only really be achieved with single file issues.

However, there should be affordances for attaching resources (like images etc.) in sidecar directories. But these really should not contain any of the issue writer’s writing.

Other options for issue formats could be:

  • Skribilo
  • org-mode
  • markdown
  • plaintext
  • XML, HTML
  • JSON

Here is an example with SOML:

First line of the issue

[meta]

status = 'open'
assigned = '@salotz'
tags = [
     'feature',
     'bug',
     'critical',
]

[description]

Here is the longer description of the issue describing what to do
about it.

It should allow introspection with links and such.

Here I am linking the path to a piece of code I want to reference:

[[proj:/src/package/__init__.py?line=30;col=12][code here]]

[discussion]

[[comments]]

contributor = '@salotz'

message = [

I think this is a good issue, I can even do this myself.

]

[[comments]]

contributor = '@wumpus'

message = [
I disagree, I don't think its worth our time.
]

Automatic issue naming

It’s not always easy to manually name issues. That is why things like github and gitlab use auto-naming, where numbers are used to refer to issues.

Tools should support this so that people don’t have to name their issues.

However, to keep things friendly the file names should support adding extra “fields” to add short descriptions.

Examples:

An uncommented issue should just be a number. Issue numbers should increase monotonically, start at 0, and include 0 padding for sorting in crummy legacy software. All tools should sort numerically.

The zeroth issue:

000000.issue.schema

The next issue:

000001.issue.schema

The issue after that, also with the name issue-key:

000002_issue-key.issue.schema
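The naming scheme above could be automated roughly as follows. This is a sketch under assumptions: the six-digit width is taken from the examples, "schema" is treated as a placeholder for whatever format extension a project picks, and `next_issue_name` and `issue_number` are hypothetical helper names.

```python
import re
from typing import Iterable, Optional

# Matches "<number>[_<key>].issue.<schema>" as in the examples above.
ISSUE_RE = re.compile(r"^(\d+)(?:_[^.]+)?\.issue\.")


def issue_number(filename: str) -> int:
    """Numeric sort key, so tools need not rely on the zero padding."""
    match = ISSUE_RE.match(filename)
    if match is None:
        raise ValueError(f"not an issue file name: {filename!r}")
    return int(match.group(1))


def next_issue_name(existing: Iterable[str], key: Optional[str] = None,
                    schema: str = "schema", width: int = 6) -> str:
    """Pick the next monotonically increasing issue file name."""
    numbers = [int(m.group(1)) for m in map(ISSUE_RE.match, existing) if m]
    number = max(numbers) + 1 if numbers else 0
    stem = f"{number:0{width}d}"
    if key:
        stem += f"_{key}"
    return f"{stem}.issue.{schema}"


files = ["000000.issue.schema", "000001.issue.schema"]
print(next_issue_name(files, key="issue-key"))
```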

Alternate 3rd Party Standard Library for Python

At one point I was looking for an alternative to the standard library.

Here is the example:

The shutil module is only good in 3.8+ because of some basic options in the copytree function, so it is pretty unreliable and confusing when you are trying to build standard tooling that should use it.

Typically in tooling I would write:

cx.run("cp -r tree ~/scratch/tree")

But this is POSIX only, and using the python standard library should be cross platform, right?

So I change to using shutil:

import shutil as sh

sh.copytree('tree', "/home/salotz/scratch/tree")

This will fail if “/home/salotz/scratch/tree” already exists, which is annoying. So you add the dirs_exist_ok option, but this is only available in Python 3.8+.

A lot of my tooling uses 3.6 or 3.7 because of dependencies in other environments. This is super annoying.

Also, there are no features in Python 3.8 that make shutil able to gain this superpower. It’s just an if statement.

For something like this I wish there was just another library I could import to get this behavior on all relevant python versions.

import altstd.shutil as sh
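A minimal sketch of what such a backport might look like, assuming it only needs to cover the merge-into-existing-directory case. The `altstd` package is hypothetical, and this simplified copytree does not replicate the stock function's symlink handling or ignore patterns.

```python
import os
import shutil


def copytree(src, dst, dirs_exist_ok=False):
    """Recursively copy src into dst, optionally merging into an existing dst.

    Uses only calls available on Python 3.6 and 3.7.
    """
    # os.makedirs raises FileExistsError when dst exists and merging is off,
    # which matches the failure described above.
    os.makedirs(dst, exist_ok=dirs_exist_ok)
    for entry in os.scandir(src):
        target = os.path.join(dst, entry.name)
        if entry.is_dir():
            copytree(entry.path, target, dirs_exist_ok=dirs_exist_ok)
        else:
            # copy2 also preserves file metadata such as timestamps.
            shutil.copy2(entry.path, target)
    return dst
```

Calling `copytree('tree', '/home/salotz/scratch/tree', dirs_exist_ok=True)` would then behave the same way on every supported version.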

A Non-Standard Standard Library for Python

The problem is that there are loads and loads of utilities that are very useful for doing common patterns or used during development (i.e. during debugging & prototyping etc.), and these are scattered across dozens and dozens of little tiny packages across the internet.

This leads to a lot of issues that are well-known. For example, you run into the famous “left pad” problem, where one of these tiny projects becomes unavailable and you lose that dependency and break all your code.

Another problem is that when you get into real projects there is a certain friction associated with making sure that these little tools are specified in the env files (i.e. requirements.in etc.) and that every variation of these files gets updated as well.

On top of that you simply have more dependencies that you have to manage and worry about breaking your code when they change something.

My idea is to go around the internet and collect all of these little utilities into a single repository that is tracked as a single dependency.

I would call this the “Non-Standard Standard Library” or the ‘NSSL’.

(Name liable to change)

Some candidates that motivated this:

Diff Collection

A collection of custom diffs that can be plugged into a variety of things like VCS.

Add support for things that aren’t amenable to standard GNU diff-like tools.

gitignore generator CLI w/ local repo support

The gitignore.io collection is a great resource with a terrible website, web API, and local-server mess used to implement it.

Something much simpler should be devised that simply works with either local repos or from a git repo.

You should also be able to combine different repos together.

For instance I like to use community maintained gitignores for a lot of stuff because then I don’t have to worry about stuff. But I typically end up home-growing my own sets of ignores that I use across many projects and I want to be able to compose them all.

For instance you can make a single gitignore using the query system they have in some command line tools as well:

giig python,emacs

Will give you both python and emacs.

One: I want to be able to specify which repo to use for this.

giig --url git+https://github.com/salotz/gitignore-salotz.git python,emacs

For, let’s say, my personal one. But I also want to be able to compose them. (Do we need this to support that or can I just use piping?)

Perhaps the above would implicitly compose that and the default community one; to get only mine I might use:

giig --no-default \
     --url git+https://github.com/salotz/gitignore-salotz.git python,emacs

This is necessary if there are conflicting names, i.e. both have Python.gitignore in them.

To allow for composition you can use in the secondary repos the name Python.patch or specify some stacks Salotz-Python.Python.stack which is a symlink to Python.stack and infers that we also want to use the Salotz-Python.gitignore template file.

And for instance combining the default repo with a local one:

giig --url file://~/my-ignores python,emacs
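The composition idea can be sketched for local template directories only. Everything here is an assumption for illustration: `compose_gitignore` is a hypothetical helper, templates are taken to be files named like Python.gitignore, names are matched case-insensitively, and the .patch and .stack conventions mentioned above are not implemented.

```python
from pathlib import Path
from typing import Iterable, List


def compose_gitignore(names: Iterable[str], repos: Iterable[Path]) -> str:
    """Concatenate matching <Name>.gitignore templates from several repos.

    Repos are consulted in the order given, so a personal repo listed after
    the community one adds to it instead of replacing it.
    """
    repo_list: List[Path] = list(repos)
    sections: List[str] = []
    for name in names:
        for repo in repo_list:
            for template in sorted(repo.glob("*.gitignore")):
                if template.stem.lower() == name.lower():
                    body = template.read_text().rstrip()
                    sections.append(f"### {repo.name}: {template.name} ###\n{body}\n")
    return "\n".join(sections)
```

Under this sketch, a conflict such as two Python.gitignore files is resolved by including both, each under a header naming the repo it came from.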

Source Code Tree Database & Query Language

I have come across a number of tools which read your source code and then generate some set of quality metrics or other representations of your code that allows you to do high-level planning and understanding about your code base.

(I am primarily interested in Python although this niche is necessary for really any code base.)

All of these tools have to reinvent the same wheel which is generating an extended “AST” which is over the files in your project.

The touchstone for this is my pymatuning which is a good prototype of this, albeit surely incomplete in many ways.

So the proposals for software would then be:

  • reliable general purpose tree data structure in python that allows for standard querying and outputs to various common formats like JSON etc.
  • a standard Python module/package representation using this tree data format.
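A small sketch of the second item, assuming nested dictionaries as the tree representation and only top-level classes and functions as leaves. This is not how pymatuning works; `module_node` and `package_tree` are hypothetical names, and a real tool would record far more.

```python
import ast
import json
from pathlib import Path


def module_node(path: Path) -> dict:
    """Describe one .py file by its top-level classes and functions."""
    tree = ast.parse(path.read_text(), filename=str(path))
    return {
        "type": "module",
        "name": path.stem,
        "classes": [n.name for n in tree.body if isinstance(n, ast.ClassDef)],
        "functions": [
            n.name for n in tree.body
            if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
        ],
    }


def package_tree(root: Path) -> dict:
    """Build a nested tree over every Python file under root."""
    children = []
    for child in sorted(root.iterdir()):
        if child.is_dir():
            subtree = package_tree(child)
            if subtree["children"]:  # skip directories without Python code
                children.append(subtree)
        elif child.suffix == ".py":
            children.append(module_node(child))
    return {"type": "package", "name": root.name, "children": children}


if __name__ == "__main__":
    print(json.dumps(package_tree(Path(".")), indent=2))
```

Because the result is plain dictionaries and lists, it serializes directly to JSON and can be queried with ordinary tree-walking code.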

GNU Coreutils numfmt clone

Should implement this in pure-python for maximum portability.

This tool converts from bytes to higher units correctly and with optional formatting help.
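The core conversion could look roughly like this in pure Python. It is a sketch of the idea only: `to_iec` is a hypothetical name, it handles only powers of 1024, and it uses ordinary round-to-nearest formatting, so it makes no attempt to reproduce numfmt's rounding rules or its option set.

```python
def to_iec(num_bytes, precision=1):
    """Format a byte count using IEC-style suffixes (powers of 1024)."""
    units = ["", "K", "M", "G", "T", "P", "E"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if abs(value) < 1024 or unit == units[-1]:
            break
        value /= 1024
    if unit == "":
        return str(int(num_bytes))
    return f"{value:.{precision}f}{unit}"


print(to_iec(1536))
```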

Local Prometheus-style monitoring & logging

After having used Prometheus for some projects for monitoring programs, I find the model usable and a niche I haven’t found in any existing profiling software.

Often profilers (at least in the python world) inject code into your program and get metrics on everything that is going on.

This is useful, but in many scenarios the results are overwhelming and quite complex.

The Prometheus approach is much like a traditional logger in that you have to actually write code to make data samples. Prometheus differs in that it supports structured data (albeit a usefully small number of things like Gauges, Counters, etc.) while logging is just some strings usually and is then queried using a full-text search engine.

The full-text search on logs seems unsatisfying. Not only have you gone the extra mile to both add logging and collect performance data (object sizes, timings, etc.), but you now have a separate problem of parsing them.

Having first class support for the most common monitoring use cases is something that should be supported.

Logging software like eliot provides a vastly improved experience on standard loggers by outputting a JSON stream of log messages. By leveraging this you can report all kinds of data including numeric data.

As an aside eliot has a focus on root cause analysis and not monitoring, and as such it has a much higher overhead in injecting the code into your “business” code. I.e. it wouldn’t really be amenable in all cases to “aspect oriented” approaches like prometheus gauges would be. That said you don’t need to use these features in eliot.

The extra things that a Prometheus-like system would have on top of eliot are a special-purpose time series database and a query engine (PromQL).

The problems with Prometheus are that it is meant to be “cloud native” and as such is best deployed in a container cluster along with a bunch of other helper services that communicate via HTTP.

While this is an approach suited to a networked environment it increases the overhead of getting up and started by a lot.

The goal of this project would be to significantly reduce this complexity and not need to have http servers running for all programs being scraped.

This could simply be done over a RESTful virtual file system like 9P on FUSE.
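As a rough illustration of collecting structured samples without an HTTP server, the sketch below keeps counters and gauges in process and appends them to any writable stream as JSON lines. A plain local file stands in for the transport here; the 9P/FUSE idea is not implemented, and `Counter`, `Gauge`, and `Registry` are hypothetical classes, not the Prometheus client API.

```python
import io
import json
import time
from typing import Dict, Optional, TextIO, Union


class Counter:
    """A value that only ever goes up."""
    kind = "counter"

    def __init__(self) -> None:
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


class Gauge:
    """A value that can be set to anything."""
    kind = "gauge"

    def __init__(self) -> None:
        self.value = 0.0

    def set(self, value: float) -> None:
        self.value = float(value)


class Registry:
    """Holds named metrics and writes samples as JSON lines."""

    def __init__(self) -> None:
        self._metrics: Dict[str, Union[Counter, Gauge]] = {}

    def counter(self, name: str) -> Counter:
        return self._metrics.setdefault(name, Counter())

    def gauge(self, name: str) -> Gauge:
        return self._metrics.setdefault(name, Gauge())

    def write_samples(self, stream: TextIO, timestamp: Optional[float] = None) -> None:
        ts = time.time() if timestamp is None else timestamp
        for name, metric in sorted(self._metrics.items()):
            sample = {"ts": ts, "name": name, "kind": metric.kind, "value": metric.value}
            stream.write(json.dumps(sample) + "\n")


registry = Registry()
registry.counter("jobs_done").inc()
registry.gauge("queue_size").set(3)
buffer = io.StringIO()  # a real program would open a local file in append mode
registry.write_samples(buffer)
print(buffer.getvalue(), end="")
```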

Other than that prometheus places a great importance on shipping as a single binary (as most golang programs do) which is useful for moving it around and easing deployment.

In our situation we could be more oriented towards plugins and shipping more like a traditional python framework.

Plugins then would essentially replace the numerous side-car services that prometheus talks to (like the alertmanager) which are just “dockered” away so that the prometheus binary remains easy to deploy.
