This is a collection of SQL for implementing geocoding in Clickhouse. It's effectively a version-controlled (and probably extended) copy of the code from my blog post Geocoding IP Addresses with Clickhouse. This is based on a blog post from Clickhouse and Guillaume Matheron's work extending this to IPv6.
To set this up in the first place, apply the SQL from
step-1-basic-geocoding.sql to Clickhouse.  It will create 3 tables,
one materialized view, and one dictionary.
If you need to update this for any reason, then you'll probably need
to manually drop the dictionary and the materialized view, as
Clickhouse doesn't currently have CREATE OR REPLACE for either.
On my machine, it takes up to 30 seconds to create the ip_trie
dictionary on first use, but after that updates shouldn't add
substantial latency.  The underlying geocoding data gets updates
monthly, and this will automatically update its view of the data
roughly once per week.
Note that creating dictionaries in Clickhouse frequently requires you
to provide a Clickhouse username and
password
(or other auth method) as part of the CREATE DICTIONARY command.
This is true even if you're accessing data in the same Clickhouse
instance.  If you miss this, then attempts to use the dict will get
authentication errors.  If this happens to you, then DROP DICTIONARY ip_trie and re-add it, changing SOURCE(clickhouse(table ‘geoip’))
to SOURCE(clickhouse(table 'geoip' user '...' password '...' db '...')).
If you add step-2-adding-isp-information.sql, then you'll also have
data on which organization owns which IP addresses.  This will be
whichever organization announces the IP block via BGP, so sometimes
it'll be the end company and sometimes it'll be their ISP.
See Geocoding IP Addresses with Clickhouse for more detail, but here's the short version
SELECT
  ip,
  dictGet('ip_trie', 'country_code', toIPv6(ip)) AS country
FROM ...You can fetch multiple fields from the ip_trie dictionary in a
single lookup; see the blog post for syntax.
You can use dictGet('ip_asn_trie', 'as_organization', toIPv6(isp)) as as_organization to fetch the name of the organization that owns the
BGP announcement for any given IP.
The data that this uses is freely available, but is licensed by DB-IP and requires attribution for use. See https://github.com/sapics/ip-location-db/tree/main/dbip-city for details.