[Oxide] Automatic content detection #11173

adamwathan · 2023-05-07T01:24:42Z

This PR adds experimental support for what we're calling "automatic content detection" — a feature that lets Tailwind detect the paths it needs to scan to figure out which classes it needs to generate completely automatically, no content configuration required.

To use it, just omit the content option from your configuration file:

  // tailwind.config.js
  module.exports = {
    // Bye bye, you won't be missed
-   content: [
-     "./app/**/*.{js,ts,jsx,tsx,mdx}",
-     "./pages/**/*.{js,ts,jsx,tsx,mdx}",
-     "./components/**/*.{js,ts,jsx,tsx,mdx}",
-   ],
    theme: {
      extend: {},
    },
    plugins: [],
  }

You can also enable it explicitly by configuring content to 'auto':

// tailwind.config.js
module.exports = {
  content: 'auto',
  theme: {
    extend: {},
  },
  plugins: [],
}

Tailwind will automatically scan every file in your project (excluding gitignored files) that might contain classes and generate all the CSS you need with no configuration.

Since this feature is currently experimental, a warning is issued in the terminal any time it's enabled to make sure you know it's not stable.

This feature is only planned for the new Oxide engine, so it will only be available on the oxide-insiders tag and not the regular insiders tag. We'll share a lot more about what Oxide is, and when we really want people to start playing with it, over the coming weeks once it's a bit further along.

How it works

Getting this to "just work" given all of the different places people use Tailwind is a serious challenge. This first stab at the problem uses a few heuristics and assumptions that are working very well for the types of projects we've tested it in:

Files are scanned using the current working directory — there's no magical way to know what the "root" of your project is, so Tailwind assumes that the current working directory is the root of your project. You should always run your build scripts from the root of your project to make sure the right files are scanned. This is only likely to be something you need to think about in complex monorepo-type setups, where you'll want to cd to the right folder in your scripts or use npm run with the --prefix option to explicitly set the CWD.
Any gitignored paths are skipped — any files or folders matched by your .gitignore file will not be scanned for classes. This prevents giant dependency folders like node_modules from being scanned, as well as any directories where you are storing generated files (like compiled CSS or JS), which avoids infinite rebuild loops.
All top-level folders are registered as content paths — we watch every top-level folder for changes as if you configured them yourself with globs like ./components/**/*.js and ./pages/**/*.js. This makes sure we notice any new files or folders you create and scan them without you needing to restart your build process.
...except ./public — we explicitly don't watch ./public/**/*.{whatever} because it's common to store generated assets like compiled CSS and JS in the ./public folder which can cause infinite rebuild loops, particularly in webpack where we can't actually register globs to watch and can only register directories. Instead, we explicitly watch each individual file in ./public that could contain classes, like an index.html file for instance.
Top-level files are watched individually — we can't watch ./**/* because webpack doesn't support globs which means we'd end up watching ./node_modules, so instead we watch every top-level file that might contain classes individually. This means creating a new top-level file currently requires restarting your build process. In practice though it's extremely rare to create new top-level files that contain classes.
All binary file extensions are skipped — we don't scan files that obviously won't contain classes, like images, videos, zip files, etc.
Stylesheet files are skipped — we don't look for classes in css, scss, sass, less, or styl files.
Common generated files are skipped — we explicitly don't scan known generated files like package-lock.json.
Tailwind configuration files are skipped — your tailwind.config.js file is never going to be a source of classes to include in the final CSS so we don't scan it.
Classes are detected using a known list of file extensions — we automatically watch for a long list of common file types that could contain Tailwind classes, like .js, .html, .php, even .json. This way you don't need to restart your dev server the first time you create one of these files in an existing project.
Additional template extensions are detected based on your specific project — when we scan your project for classes, we keep track of every file extension we see and add them to our master list of extensions to watch for your project. So if you are using some obscure templating language I've never heard of that uses the *.potato extension, we'll watch *.potato files in all folders as long as we see at least one *.potato file when the build process starts.
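
To make the file-level heuristics above a bit more concrete, here is a minimal sketch in JavaScript. It is not the Oxide implementation (that lives in Rust); the function names `shouldScan` and `extensionsToWatch` are hypothetical, the extension lists are heavily abbreviated, and it assumes gitignored paths were already filtered out before these functions are called.

```javascript
// Hypothetical sketch of the scan-filtering heuristics, not Tailwind's code.
// The lists below are abbreviated examples, not the real lists.
const path = require('node:path')

const BINARY_EXTENSIONS = new Set(['png', 'jpg', 'gif', 'mp4', 'zip', 'woff2'])
const STYLESHEET_EXTENSIONS = new Set(['css', 'scss', 'sass', 'less', 'styl'])
const GENERATED_FILES = new Set(['package-lock.json'])
const KNOWN_TEMPLATE_EXTENSIONS = new Set(['js', 'html', 'php', 'json'])

// Extension without the leading dot, lowercased ('' when there is none).
function extensionOf(file) {
  return path.extname(file).slice(1).toLowerCase()
}

// Decide whether a single (non-gitignored) file should be scanned for classes.
function shouldScan(file, configPath = 'tailwind.config.js') {
  const base = path.basename(file)
  if (base === path.basename(configPath)) return false // config is never a class source
  if (GENERATED_FILES.has(base)) return false // known generated files
  const ext = extensionOf(file)
  return !BINARY_EXTENSIONS.has(ext) && !STYLESHEET_EXTENSIONS.has(ext)
}

// Known extensions plus every scannable extension seen when the build starts.
function extensionsToWatch(files) {
  const extensions = new Set(KNOWN_TEMPLATE_EXTENSIONS)
  for (const file of files) {
    const ext = extensionOf(file)
    if (ext !== '' && shouldScan(file)) extensions.add(ext)
  }
  return extensions
}

console.log(shouldScan('src/index.html')) // true
console.log(shouldScan('public/logo.png')) // false
console.log(extensionsToWatch(['views/home.potato']).has('potato')) // true
```

The second function is why a `*.potato` file only gets picked up if at least one exists when the build process starts: the extension set is computed once from the files seen at that point.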

In our testing these heuristics work great, and we've been able to remove the content configuration from every one of our own projects that we've tried it in.

If for whatever reason these heuristics don't work properly for your project, you can explicitly configure content just like you were doing before, and Tailwind will respect that configuration and not try to do any automatic content detection at all. This way you always have the option of full control over which files are scanned for classes.

Known limitations

How things work currently isn't perfect, and there are a few known limitations you might run into depending on how your project is structured.

Creating a new file with an unseen extension that isn't in our safelist requires restarting your build process — if your build process is already running and you try to create your very first *.piledriver file, Tailwind won't notice it and you'll need to restart your script.
Running your build command from a different directory will scan the wrong paths — because we treat the current working directory as your project root, you need to run your build script from the right place or Tailwind will scan the wrong files. In practice you probably will never notice this — you are already doing this if you use npm run {command} because that's how npm run already works.
You can't force specific gitignored files to be scanned while automatically detecting every other file — if you have some specific files you need to scan that live in node_modules but you want to ignore everything else in node_modules, you currently need to opt-out of automatic content detection and go back to explicitly configuring your content paths.
Creating a new top-level folder that includes files that need to be scanned requires restarting your build process — since we only watch top-level folders for new file events and not the entire project root, creating new top-level folders requires restarting your build process. A common example of where you might run into this is creating a ./components folder for the first time in a Next.js project while your dev server is already running.
The only way to explicitly prevent scanning a path is to gitignore it — you can't tell Tailwind not to scan a folder for classes without also gitignoring that folder. If you need more control, you need to opt-out of automatic content detection and configure content explicitly.

Despite these limitations, we're still finding automatic content detection to be miles ahead of explicit content configuration in terms of developer experience, and for projects that are structured in a conventional way you pretty much don't ever see or feel these limitations at all.

Planned improvements

While we're ready to start shipping support for this in our oxide-insiders builds as-is, we do have some improvements we plan to explore that will hopefully make the experience even better:

Skipping gitignored folders within top-level folders — because of limitations with webpack's dependency tracking APIs our current implementation doesn't skip gitignored folders unless they are top-level. So if you have something like ./src/node_modules, we still scan that folder. We should be able to solve this though, maybe even before we merge this PR.
Support for scanning specific paths in addition to automatically detected paths — using something like a new @source "./node_modules/my-library/dist/**/*.js" directive in your CSS, we hope to make it possible to scan paths that live within ignored directories without opting out of automatic content detection. This will also make it possible to scan for classes in parent/sibling directories, which some people might need in certain monorepo setups.
Configure content paths more intelligently based on the running build tool — not all build tools offer the same amount of control when it comes to registering paths we need to watch for changes, with webpack being the most limited. Currently we are solving for the lowest common denominator, but that's where limitations like "can't notice newly created top-level folders" come from. We can technically solve that in tools like Vite that offer more control, but to do that we need to detect the build tool you're using and intelligently register different dependency paths. We plan to explore this and see what we can come up with.

Really excited about this one, I think it's the biggest step-function improvement to the developer experience in Tailwind since the JIT engine. Looking forward to getting everyone playing with it so we can refine our heuristics and get things feeling as rock-solid as possible.

ArnaudBarre · 2023-05-07T15:36:46Z

Hi, and first of all thanks for all the great work and the thought put into every API and into Tailwind in general. It totally changed the way I author CSS over the last three years.

Would it be possible for Tailwind to offer a more bundler friendly API that let another tool select the appropriate files to be scanned (and also handle the change detection)? The last point you made still feels like you want Tailwind to detect the build tool, instead of providing an API for build tools to create plugins on top of it.

I know the current setup is nice for people from other languages to even be able to run Tailwind without node, but for ESM first tools like Vite, we have some hacks in the HMR handling specifically for Tailwind.

Letting the bundler dictate the content being scanned means that you only scan the code that is in the final bundle, which leads to multiple benefits:

faster: you don't scan unrelated content
multiple scoped CSS bundles in the same directory are easy
importing components from a shared folder one level above the project folder works
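
To illustrate the idea, here is a rough sketch of what a bundler-driven scanner could look like. Everything in it is hypothetical: `createScanner`, `update`, `remove` and `candidates` are made-up names and no such API exists in Tailwind today. The token extraction is also deliberately naive (it just splits on whitespace, quotes and angle brackets).

```javascript
// Hypothetical API sketch: none of these names exist in Tailwind.
function createScanner() {
  // Map of module id -> set of candidate class names found in that module.
  const candidatesByModule = new Map()

  return {
    // The bundler calls this whenever a module is loaded or changes.
    update(id, source) {
      const tokens = source.match(/[^\s"'`<>]+/g) ?? []
      candidatesByModule.set(id, new Set(tokens))
    },
    // The bundler calls this when a module leaves the module graph.
    remove(id) {
      candidatesByModule.delete(id)
    },
    // Union of the candidates of every module currently in the bundle.
    candidates() {
      const all = new Set()
      for (const tokens of candidatesByModule.values()) {
        for (const token of tokens) all.add(token)
      }
      return all
    },
  }
}

const scanner = createScanner()
scanner.update('src/App.jsx', '<div className="p-4 text-sm">Hello</div>')
console.log(scanner.candidates().has('p-4')) // true
```

With this shape, the bundler owns file selection and change detection, and only modules that are actually part of the bundle ever contribute candidates.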

jpsc · 2023-05-07T15:38:37Z

This seems like a great DX improvement for common projects. And this is coming from someone who doesn't have a problem with needing to set content.

Support for scanning specific paths in addition to automatically detected paths — using something like a new @source "./node_modules/my-library/dist/**/*.js" directive in your CSS, we hope to make it possible to scan paths that live within ignored directories without opting out of automatic content detection.

Why would this be better than setting content?

I will definitely need to use this or the content option, because we use a component library via an npm package and node_modules is usually gitignored.

adamwathan · 2023-05-07T16:48:45Z

@ArnaudBarre I think that all sounds great and would love to explore it more concretely — any interest in connecting about it sometime in the next week or two?

adamwathan · 2023-05-07T16:50:06Z

Why would this be better than setting content?

@jpsc It's the same really — another thing we're trying to do for v4 is support more configuration from your CSS file instead of needing the JS config is all. You already need a CSS file no matter what, it would be nice if you could do everything in one place instead of needing two files.

anonrig · 2023-05-07T16:55:16Z

integrations/io.js

+        !(await fs
+          .stat(filePath)
+          .then(() => true)
+          .catch(() => false))


existsSync is a better solution for this, and avoids the catch call. Alternatively, statSync accepts a { throwIfNoEntry: false } option, which returns undefined for a missing path and avoids the try/catch as well.

import fs from 'node:fs'
if (!fs.existsSync(filePath)) { /* ... */ }
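
For comparison, here is a small sketch of the three ways to check for a file that come up in this review comment. The helper names are hypothetical; the `fs` calls themselves are standard Node.js APIs. Note that for `statSync` the option value that suppresses the error is `throwIfNoEntry: false` (the default is `true`).

```javascript
const fs = require('node:fs')

// Async check, equivalent to the stat/then/catch pattern in the diff above.
async function existsAsync(filePath) {
  return fs.promises.stat(filePath).then(
    () => true,
    () => false
  )
}

// Synchronous check with existsSync: returns a boolean and never throws.
function existsViaExistsSync(filePath) {
  return fs.existsSync(filePath)
}

// Synchronous check with statSync: with throwIfNoEntry set to false, a missing
// path yields undefined instead of throwing ENOENT. Other errors still throw.
function existsViaStatSync(filePath) {
  return fs.statSync(filePath, { throwIfNoEntry: false }) !== undefined
}

// Hypothetical path that should not exist in the current working directory.
const missingPath = process.cwd() + '/__no_such_file_for_this_example__'
console.log(existsViaExistsSync(missingPath), existsViaStatSync(missingPath))
```

The synchronous variants block the event loop, which is usually fine in a test helper like integrations/io.js but worth keeping in mind elsewhere.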

src/util/normalizeConfig.js

anonrig · 2023-05-07T16:59:14Z

src/util/validateConfig.js

@@ -1,14 +1,21 @@
 import log from './log'

 export function validateConfig(config) {
-  if (config.content.files.length === 0) {
+  if (config.content.files !== 'auto' && config.content.files.length === 0) {


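It seems this change is unnecessary: when files is the string 'auto', files.length is 4, so the original files.length === 0 check is already false and the extra !== 'auto' guard changes nothing.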
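
A quick way to convince yourself of this, assuming `config.content.files` is always either an array or the string `'auto'` by the time validation runs (that is an assumption on my part, and the function names below are hypothetical stand-ins for the two versions of the condition):

```javascript
// Stand-in for the condition before the change.
function isEmptyBefore(files) {
  return files.length === 0
}

// Stand-in for the condition after the change.
function isEmptyAfter(files) {
  return files !== 'auto' && files.length === 0
}

// Both versions agree for every kind of value we expect to see.
for (const files of [[], ['./src/**/*.js'], 'auto']) {
  console.log(JSON.stringify(files), isEmptyBefore(files), isEmptyAfter(files))
}
```

The extra guard may still be worth keeping purely for readability, since it makes the 'auto' case explicit to whoever reads the validation code next.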

oxide/crates/core/src/lib.rs

ArnaudBarre · 2023-05-07T18:05:14Z

@adamwathan I would be happy to discuss/explore this! I am quite available until May 17 (on UTC+2 TZ)

RobinMalfait · 2023-05-07T20:45:37Z

Skipping gitignored folders within top-level folders — because of limitations with webpack's dependency tracking APIs our current implementation doesn't skip gitignored folders unless they are top-level. So if you have something like ./src/node_modules, we still scan that folder. We should be able to solve this though, maybe even before we merge this PR.

This part is solved now ✅

jpsc · 2023-05-08T06:08:13Z

Why would this be better than setting content?

@jpsc It's the same really — another thing we're trying to do for v4 is support more configuration from your CSS file instead of needing the JS config is all. You already need a CSS file no matter what, it would be nice if you could do everything in one place instead of needing two files.

Ok, that's understandable. Why would one file be less than two? So you could completely drop tailwind.config and just have the tailwind cli?

ChristophP · 2023-05-08T21:50:30Z

Hmm, as you mentioned, it's not so easy to find heuristics that reliably find the files that are relevant. Good trick skipping binary files though.
But overall, to me it seems like the cost of adding the content in the config is very low (one-time setup) compared to the potential pitfalls and the additional unnecessary file watching (repeatedly) that I assume will likely happen with this approach.

adamwathan · 2023-05-12T13:48:23Z

Going to merge this in so it's easier for people to play with and test — no guarantees we don't change directions here but hard to know how it plays out without just giving it a shot and trying to iterate on it 👍

* resolve all _existing_ content paths
* pin `@napi-rs/cli`
* WIP: Log all resolved content files/globs
* only filter out raw changed content in non-auto mode
* skip parseCandidateFiles cache in `auto` mode
* improve algorithm of detecting content paths

  1. Files in the root should be listed statically instead of using globs.
  2. Files and folders in special known direct child folders should be listed statically instead of using globs (e.g.: `public`). This is because these special folders are often used to store generated AND source files at the same time. Using globs could trigger infinite loops because we are watching and acting upon dist files.
  3. All file extensions found in the project, should be used in the globs in addition to a known set of extensions.
  4. Direct folders seen from the root, can use the glob syntax `<root>/src/**/*.{...known-extensions}`

* inline wanted-extensions

  Not 100% convinced yet, but seems cleaner so far.

* ensure writing an file also makes the parent folder(s)
* add integration tests for the auto content feature
* add pnpm and bun lock files
* Revert "inline wanted-extensions"

  This reverts commit 879c124.

* sort binary-extensions and add lockb
* sort + add `lock` to ignored extensions
* drop `yarn.lock`, because lock extensions are already covered
* group template extensions

  This will make it a bit easier to organize in the future.

* drop empty lines and commented lines from template-extensions
* skip the config path when resolving template files

  The config file will automatically trigger a rebuild when this file is changed. However, this should not be part of the template files because that could cause additional css that's not being used.

* make `auto content` the default in the oxide engine

  - In the oxide engine, the default `content: []` will be dropped from the default configuration (config.simple.js, config.full.js).
  - If you have `content: []` or `content: { files: [] }` then the auto content feature won't be active. However if those arrays are empty a warning will still be shown. Adding files/globs or dropping the `content` section completely will enable auto content.

* only test the auto content integration test in the oxide engine
* set `content.files` to `auto` instead of using `auto: boolean`

  This way we don't run into the issue where the `config.content.files` is set and the `config.content.auto` is set to true.

* drop log
* ensure we validate the config in the CLI
* show experimental warning for automatic content detection
* use cached version of the getCandidateFiles instead of bypassing it
* use `is_empty()` shorthand

  Thanks, Clippy!

* add test to ensure nested ignored folders are not scanned
* add `tempfile` for tests
* add auto content tests in Rust
* refactor auto content detection

  This will also make sure that if we have (deeply) nested ignored folders, then we won't use deeply nested globs (**/*.{js,html}) for the parent(s) of the nested ignored folders but instead use a shallow glob for each directory (*/*.{js,html}). Then each sibling directory of the parent can use deeply nested globs again except for the direct parent.

* use consistent comments
* ensure ignored static listed files are not present
* improve performance by ~30x

  On a big test project this goes from ~6s to ~200ms

* improve performance by ~5x

  We started with a ~6s duration
  Then in the previous commit, we improved it by ~30x and it went down to ~200ms
  Now with this change, it takes about ~40ms. That's another ~5x improvement.
  Or in total a ~150x improvement.

* ensure nested folders in `public/` are also explicitly listed
* add shortcut for normalizing files

  This is only called once so won't do anything to the main performance of Tailwind CSS. But always nice to make small performance improvements!

* run Rust tests in CI
* fix lint warnings
* update changelog
* Update CHANGELOG.md

---------

Co-authored-by: Robin Malfait <malfait.robin@gmail.com>
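
The four path-selection rules from the "improve algorithm of detecting content paths" commit can be sketched roughly like this. This is an illustrative JavaScript approximation, not the Rust code from the PR: `planContentPaths` and `STATIC_FOLDERS` are hypothetical names, the input is assumed to be a list of project-relative paths using forward slashes with ignored and unscannable files already removed, and the shallow-glob handling for nested ignored folders is left out entirely.

```javascript
// Hypothetical sketch of rules 1-4, not the actual Oxide implementation.

// Special folders that often mix generated and source files (rule 2).
const STATIC_FOLDERS = new Set(['public'])

// files: project-relative paths, already filtered (assumption).
// extensions: known extensions plus the ones found in the project (rule 3).
function planContentPaths(files, extensions) {
  const pattern = `*.{${[...extensions].sort().join(',')}}`
  const paths = new Set()

  for (const file of files) {
    const segments = file.split('/')
    if (segments.length === 1 || STATIC_FOLDERS.has(segments[0])) {
      // Rules 1 and 2: root files and files in special folders are listed statically.
      paths.add(`./${file}`)
    } else {
      // Rule 4: every other top-level folder gets one deep glob.
      paths.add(`./${segments[0]}/**/${pattern}`)
    }
  }

  return [...paths].sort()
}

console.log(
  planContentPaths(
    ['index.html', 'public/index.html', 'src/app.js', 'src/components/button.js'],
    ['js', 'html']
  )
)
// [ './index.html', './public/index.html', './src/**/*.{html,js}' ]
```

Listing the root and `public` files statically is what avoids watching generated output, at the cost of needing a restart when a new top-level file or folder shows up.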

anonrig reviewed May 7, 2023

RobinMalfait added 20 commits May 12, 2023 15:50

resolve all _existing_ content paths

1a486fb

pin @napi-rs/cli

125b389

WIP: Log all resolved content files/globs

cdd1490

only filter out raw changed content in non-auto mode

fc9f063

skip parseCandidateFiles cache in auto mode

60801cc

inline wanted-extensions

774f0a6

Not 100% convinced yet, but seems cleaner so far.

ensure writing an file also makes the parent folder(s)

65d9d49

add integration tests for the auto content feature

23c4cd1

add pnpm and bun lock files

099c0d2

Revert "inline wanted-extensions"

c1c87ef

This reverts commit 879c124.

sort binary-extensions and add lockb

abbcd0b

sort + add lock to ignored extensions

1a59e23

drop yarn.lock, because lock extensions are already covered

fd8a392

group template extensions

765a16a

This will make it a bit easier to organize in the future.

drop empty lines and commented lines from template-extensions

588d991

skip the config path when resolving template files

5339e05

The config file will automatically trigger a rebuild when this file is changed. However, this should not be part of the template files because that could cause additional css that's not being used.

only test the auto content integration test in the oxide engine

2f4bc75

set content.files to auto instead of using auto: boolean

861b15b

This way we don't run into the issue where the `config.content.files` is set and the `config.content.auto` is set to true.

RobinMalfait added 18 commits May 12, 2023 15:50

drop log

118f105

ensure we validate the config in the CLI

474cf71

show experimental warning for automatic content detection

cbfef37

use cached version of the getCandidateFiles instead of bypassing it

735c278

use is_empty() shorthand

06201a0

Thanks, Clippy!

add test to ensure nested ignored folders are not scanned

3655b10

add tempfile for tests

81e53be

add auto content tests in Rust

a0bbac5

use consistent comments

8f20a70

ensure ignored static listed files are not present

f1213da

improve performance by ~30x

18da7a5

On a big test project this goes from ~6s to ~200ms

improve performance by ~5x

eb9c056

We started with a ~6s duration Then in the previous commit, we improved it by ~30x and it went down to ~200ms Now with this change, it takes about ~40ms. That's another ~5x improvement. Or in total a ~150x improvement.

ensure nested folders in public/ are also explicitly listed

50f8820

add shortcut for normalizing files

bdbb852

This is only called once so won't do anything to the main performance of Tailwind CSS. But always nice to make small performance improvements!

run Rust tests in CI

f11b270

fix lint warnings

0dcd096

update changelog

289bc20

RobinMalfait force-pushed the feat/auto-content branch from b66d8ae to 289bc20 Compare May 12, 2023 13:51

Update CHANGELOG.md

82d626c

RobinMalfait merged commit a7f7b76 into master May 12, 2023
21 checks passed

RobinMalfait deleted the feat/auto-content branch May 12, 2023 14:13
