Each job listing begins with the following ordered, pipe-delimited metadata, followed by any number of paragraphs for further info. The unique string "metafriendly" is added to the post to let parsers know that it conforms to the spec
[Company name] | [Job title] | [location(s), semi-colon delimited] | [Remote/Onsite, semi-colon delimited] | [Full-Time/Part-Time/Intern] | [Citizen/Visa (optional type, semi-colon delimited)] | [Optional list of semi-colon delimited keywords] [Additional Freeform Information] metafriendly
Acme Products | Test Engineer | Las Vegas, NV; Austin, TX | Onsite; Remote | Full-Time; Part-Time | Visa (H1B) | Tunnel Theory; Kinematics Engineer needed to test prototype products. Must be able to lift and carry anvils. metafriendly
- Location: MUST be geocodable; Any string MUST give one unambiguous set of LatLng coordinates.
- Onsite/Remote: The word
RemoteSHOULD only be used in the positive sense; phrases such as
No Remoteare not allowed.
- Citizen/Visa: It can be assumed that by stating
Visa, the job posting also accepts applicants that are
Citizencan be omitted. Please note that this is not true for other attributes; it is possible for companies to look for Remote only applicants, or only Part-Time applicants.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Why have a spec at all?
To be better able to filter out irrelevant job listings. There are a number of third party sites that parse whoishiring information, but are unable to accurately do so because of inconsistent formatting. Most notably, false positives on
remote work, being unable to accurately determine job location, and mistaking the keyword
intern with international in job descriptions.
Why not a more machine friendly format, like YAML?
A line break after each attribute would balloon the vertical size of posts quite a bit, making the original page of posts not as useful. The spec can be easilly parsed by using regular expressions as long as the spec is followed.
What if I don't want to share the company name?
Stealth Company or something in the beginning.
This pipe-delimited spec is mostly stolen from this comment by danso. This comment by phantom_oracle convinced me to add the Onsite/Remote distinction.