2 changes: 1 addition & 1 deletion docs/.prettierrc
@@ -3,7 +3,7 @@
"tabWidth": 2,
"useTabs": false,
"semi": true,
"singleQuote": true,
"singleQuote": false,
"arrowParens": "always",
"trailingComma": "es5",
"bracketSpacing": true,
10 changes: 5 additions & 5 deletions docs/content/Auth/Security-Context.mdx
@@ -11,7 +11,7 @@ context claims to evaluate access control rules. Inbound JWTs are decoded and
verified using industry-standard [JSON Web Key Sets (JWKS)][link-auth0-jwks].

For access control or authorization, Cube allows you to define granular access
- control rules for every cube in your data schema. Cube uses both the request and
+ control rules for every cube in your data model. Cube uses both the request and
security context claims in the JWT token to generate a SQL query, which includes
row-level constraints from the access control rules.

@@ -132,11 +132,11 @@ LIMIT 10000
In the example below `user_id`, `company_id`, `sub` and `iat` will be injected
into the security context and will be accessible in both the [Security
Context][ref-schema-sec-ctx] and [`COMPILE_CONTEXT`][ref-cubes-compile-ctx]
- global variable in the Cube Data Schema.
+ global variable in the Cube data model.

<InfoBox>

- `COMPILE_CONTEXT` is used by Cube at schema compilation time, which allows
+ `COMPILE_CONTEXT` is used by Cube at data model compilation time, which allows
changing the underlying dataset completely; the Security Context is only used at
query execution time, which simply filters the dataset with a `WHERE` clause.

@@ -151,8 +151,8 @@ query execution time, which simply filters the dataset with a `WHERE` clause.
}
```

- With the same JWT payload as before, we can modify schemas before they are
- compiled. The following schema will ensure users only see results for their
+ With the same JWT payload as before, we can modify models before they are
+ compiled. The following cube will ensure users only see results for their
`company_id` in a multi-tenant deployment:

```javascript
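// The cube body is elided in this diff. As a rough sketch of the pattern the
// text describes (identifiers are illustrative, not taken from the PR), the
// `company_id` claim can be read from COMPILE_CONTEXT to switch the dataset:
const {
  securityContext: { company_id: companyId },
} = COMPILE_CONTEXT;

cube(`Orders`, {
  // Each tenant compiles against its own database schema, e.g. company_42.orders
  sql: `SELECT * FROM company_${companyId}.orders`,
});
```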
30 changes: 14 additions & 16 deletions docs/content/Caching/Getting-Started-Pre-Aggregations.mdx
@@ -40,7 +40,7 @@ layer][ref-caching-preaggs-cubestore].
## Pre-Aggregations without Time Dimension

To illustrate pre-aggregations with an example, let's use a sample e-commerce
- database. We have a schema representing all our `Orders`:
+ database. We have a data model representing all our `Orders`:

```javascript
cube(`Orders`, {
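  // The rest of the cube is elided in this diff. A minimal sketch of the kind
  // of pre-aggregation this section describes (measure and dimension names
  // are assumed for illustration):
  measures: {
    count: {
      type: `count`,
    },
  },

  preAggregations: {
    ordersRollup: {
      measures: [CUBE.count],
      dimensions: [CUBE.status],
    },
  },
});
```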
@@ -106,9 +106,9 @@ cube(`Orders`, {

## Pre-Aggregations with Time Dimension

- Using the same schema as before, we are now finding that users frequently query
- for the number of orders completed per day, and that this query is performing
- poorly. This query might look something like:
+ Using the same data model as before, we are now finding that users frequently
+ query for the number of orders completed per day, and that this query is
+ performing poorly. This query might look something like:

```json
{
@@ -118,7 +118,7 @@ poorly. This query might look something like:
```
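
Although the query body is elided above, a Cube REST API query for daily
completed orders would look roughly like the following sketch (member names
are illustrative, not taken from the PR):

```json
{
  "measures": ["Orders.count"],
  "timeDimensions": [
    {
      "dimension": "Orders.completedAt",
      "granularity": "day"
    }
  ],
  "filters": [
    {
      "member": "Orders.status",
      "operator": "equals",
      "values": ["completed"]
    }
  ]
}
```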

In order to improve the performance of this query, we can add another
- pre-aggregation definition to the `Orders` schema:
+ pre-aggregation definition to the `Orders` cube:

```javascript
cube(`Orders`, {
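  // Elided in this diff. A sketch of a rollup that would match the daily
  // query above (names assumed):
  preAggregations: {
    ordersByDay: {
      measures: [CUBE.count],
      timeDimension: CUBE.completedAt,
      granularity: `day`,
    },
  },
});
```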
@@ -245,7 +245,7 @@ fields and still get a correct result:
| 2021-01-22 00:00:00.000000 | 13 | 150 |

This means that `quantity` and `price` are both **additive measures**, and we
- can represent them in the `LineItems` schema as follows:
+ can represent them in the `LineItems` cube as follows:

```javascript
cube(`LineItems`, {
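  // Elided in this diff. A sketch of the additive measures discussed in the
  // text (column names assumed):
  measures: {
    quantity: {
      sql: `quantity`,
      type: `sum`,
    },
    price: {
      sql: `price`,
      type: `sum`,
    },
  },
});
```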
@@ -340,7 +340,7 @@
We can clearly see that `523` **does not** equal `762.204545454545455`, and we
cannot treat the `profit_margin` column the same as we would any other additive
measure. Armed with the above knowledge, we can add the `profit_margin` field to
- our schema **as a [dimension][ref-schema-dims]**:
+ our cube **as a [dimension][ref-schema-dims]**:

```javascript
cube(`LineItems`, {
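  // Elided in this diff. A sketch of modeling the non-additive profit margin
  // as a dimension, per the advice above (column name assumed):
  dimensions: {
    profitMargin: {
      sql: `profit_margin`,
      type: `number`,
      format: `percent`,
    },
  },
});
```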
@@ -437,17 +437,15 @@ To recap what we've learnt so far:
`count`, `sum`, `min`, `max` or `countDistinctApprox`

Cube looks for matching pre-aggregations in the order they are defined in a
- cube's schema file. Each defined pre-aggregation is then tested for a match
+ cube's data model file. Each defined pre-aggregation is then tested for a match
based on the criteria in the flowchart below:

- <div
- style="text-align: center"
- >
+ <div style="text-align: center">
<img
- alt="Pre-Aggregation Selection Flowchart"
- src="https://ucarecdn.com/f986b0cb-a9ea-47b7-a743-ca9a4644c246/"
- style="border: none"
- width="100%"
+ alt="Pre-Aggregation Selection Flowchart"
+ src="https://ucarecdn.com/f986b0cb-a9ea-47b7-a743-ca9a4644c246/"
+ style="border: none"
+ width="100%"
/>
</div>

@@ -470,7 +468,7 @@ Some extra considerations for pre-aggregation selection:
`['2020-01-01T00:00:00.000', '2020-01-01T23:59:59.999']`. Date ranges are
inclusive, and the minimum granularity is `second`.

- - The order in which pre-aggregations are defined in schemas matter; the first
+ - The order in which pre-aggregations are defined in models matters; the first
matching pre-aggregation for a query is the one that is used. Both the
measures and dimensions of any cubes specified in the query are checked to
find a matching `rollup`.
11 changes: 5 additions & 6 deletions docs/content/Caching/Overview.mdx
@@ -49,8 +49,8 @@ more about read-only support and pre-aggregation build strategies.

</InfoBox>

- Pre-aggregations are defined in the data schema. You can learn more about
- defining pre-aggregations in [schema reference][ref-schema-ref-preaggs].
+ Pre-aggregations are defined in the data model. You can learn more about
+ defining pre-aggregations in [data modeling reference][ref-schema-ref-preaggs].

```javascript
cube(`Orders`, {
@@ -142,10 +142,9 @@ The default values for `refreshKey` are
- `every: '10 second'` for all other databases.

You can use a custom SQL query to check if a refresh is required by changing
- the [`refreshKey`][ref-schema-ref-cube-refresh-key] property in a cube's Data
- Schema. Often, a `MAX(updated_at_timestamp)` for OLTP data is a viable option,
- or examining a metadata table for whatever system is managing the data to see
- when it last ran.
+ the [`refreshKey`][ref-schema-ref-cube-refresh-key] property in a cube. Often, a
+ `MAX(updated_at_timestamp)` for OLTP data is a viable option, or examining a
+ metadata table for whatever system is managing the data to see when it last ran.
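
As a sketch of this approach (table and column names assumed), such a
`refreshKey` could look like:

```javascript
cube(`Orders`, {
  sql: `SELECT * FROM orders`,

  // The cache is considered stale only when the result of this query changes
  refreshKey: {
    sql: `SELECT MAX(updated_at_timestamp) FROM orders`,
  },
});
```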

### <--{"id" : "In-memory Cache"}--> Disabling the cache

3 changes: 2 additions & 1 deletion docs/content/Caching/Using-Pre-Aggregations.mdx
@@ -7,7 +7,8 @@ menuOrder: 3

Pre-aggregations are a powerful way to speed up your Cube queries. There are many
configuration options to consider. Please make sure to also check [the
- Pre-Aggregations reference in the data schema section][ref-schema-ref-preaggs].
+ Pre-Aggregations reference in the data modeling
+ section][ref-schema-ref-preaggs].

## Refresh Strategy

20 changes: 10 additions & 10 deletions docs/content/Configuration/Advanced/Multitenancy.mdx
@@ -6,7 +6,7 @@ subCategory: Advanced
menuOrder: 3
---

- Cube supports multitenancy out of the box, both on database and data schema
+ Cube supports multitenancy out of the box, both on database and data model
levels. Multiple drivers are also supported, meaning that you can have one
customer’s data in MongoDB and others in Postgres with one Cube instance.

@@ -34,7 +34,7 @@ combinations of these configuration options.

### <--{"id" : "Multitenancy"}--> Multitenancy vs Multiple Data Sources

- In cases where your Cube schema is spread across multiple different data
+ In cases where your Cube data model is spread across multiple different data
sources, consider using the [`dataSource` cube property][ref-cube-datasource]
instead of multitenancy. Multitenancy is designed for cases where you need to
serve different datasets for multiple users, or tenants which aren't related to
@@ -169,7 +169,7 @@ cube(`Products`, {
### <--{"id" : "Multitenancy"}--> Running in Production

Each unique id generated by `contextToAppId` or `contextToOrchestratorId` will
- generate a dedicated set of resources, including schema compile cache, SQL
+ generate a dedicated set of resources, including data model compile cache, SQL
compile cache, query queues, in-memory result caching, etc. Depending on your
data model complexity and usage patterns, those resources can have a pretty
sizable memory footprint ranging from single-digit MBs on the lower end and
@@ -219,7 +219,7 @@ module.exports = {
};
```

- ## Multiple DB Instances with Same Schema
+ ## Multiple DB Instances with Same Data Model

Let's consider an example where we store data for different users in different
databases, but on the same Postgres host. The database name format is
@@ -249,12 +249,12 @@ select the database, based on the `appId` and `userId`:
<WarningBox>

The App ID (the result of [`contextToAppId`][ref-config-ctx-to-appid]) is used
- as a caching key for various in-memory structures like schema compilation
+ as a caching key for various in-memory structures like data model compilation
results, connection pool. The Orchestrator ID (the result of
[`contextToOrchestratorId`][ref-config-ctx-to-orch-id]) is used as a caching key
for database connections, execution queues and pre-aggregation table caches. Not
- declaring these properties will result in unexpected caching issues such as
- schema or data of one tenant being used for another.
+ declaring these properties will result in unexpected caching issues such as the
+ data model or data of one tenant being used for another.

</WarningBox>

@@ -292,7 +292,7 @@ module.exports = {
};
```
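
To make the warning above concrete, here is a minimal sketch of declaring both
IDs (the tenant claim name is assumed for illustration):

```javascript
module.exports = {
  // Keep data model compilation and query orchestration caches per tenant
  contextToAppId: ({ securityContext }) =>
    `CUBEJS_APP_${securityContext.tenantId}`,
  contextToOrchestratorId: ({ securityContext }) =>
    `CUBEJS_APP_${securityContext.tenantId}`,
};
```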

- ## Multiple Schema and Drivers
+ ## Multiple Data Models and Drivers

What if for application with ID 3, the data is stored not in Postgres, but in
MongoDB?
@@ -301,9 +301,9 @@ We can instruct Cube to connect to MongoDB in that case, instead of Postgres. To
do this, we'll use the [`driverFactory`][ref-config-driverfactory] option to
dynamically set database type. We will also need to modify our
[`securityContext`][ref-config-security-ctx] to determine which tenant is
- requesting data. Finally, we want to have separate data schemas for every
+ requesting data. Finally, we want to have separate data models for every
application. We can use the [`repositoryFactory`][ref-config-repofactory] option
- to dynamically set a repository with schema files depending on the `appId`:
+ to dynamically set a repository with data model files depending on the `appId`:

**cube.js:**

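The configuration is elided in this diff, but a rough sketch of the pattern
(driver types, claim names, and paths here are illustrative, not taken from
the PR) might look like:

```javascript
const { FileRepository } = require("@cubejs-backend/server-core");

module.exports = {
  // Application 3 reads from MongoDB; everyone else from Postgres
  driverFactory: ({ securityContext }) =>
    securityContext.appId === 3 ? { type: "mongobi" } : { type: "postgres" },

  // Each application gets its own directory of data model files
  repositoryFactory: ({ securityContext }) =>
    new FileRepository(`model/${securityContext.appId}`),
};
```
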
4 changes: 2 additions & 2 deletions docs/content/Configuration/Downstream/Superset.mdx
@@ -69,7 +69,7 @@ a new database:
Your cubes will be exposed as tables, where both your measures and dimensions
are columns.

- Let's use the following Cube data schema:
+ Let's use the following Cube data model:

```javascript
cube(`Orders`, {
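  // Elided in this diff. A sketch of the count measure that Superset's
  // COUNT(*) maps onto (names assumed):
  measures: {
    count: {
      type: `count`,
    },
  },
});
```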
@@ -124,7 +124,7 @@ a time grain of `month`.

The `COUNT(*)` aggregate function is being mapped to a measure of type
[count](/schema/reference/types-and-formats#measures-types-count) in Cube's
- **Orders** schema file.
+ **Orders** data model file.

## Additional Configuration

9 changes: 6 additions & 3 deletions docs/content/Deployment/Cloud/Continuous-Deployment.mdx
@@ -56,16 +56,19 @@ Cube Cloud will automatically deploy from the specified production branch

<WarningBox>

- Enabling this option will cause the Schema page to display the last known state of a Git-based codebase (if available), instead of reflecting the latest modifications made.
- It is important to note that the logic will still be updated in both the API and the Playground.
+ Enabling this option will cause the <Btn>Data Model</Btn> page to display the
+ last known state of a Git-based codebase (if available), instead of reflecting
+ the latest modifications made. It is important to note that the logic will still
+ be updated in both the API and the Playground.

</WarningBox>

You can use the CLI to set up continuous deployment for a Git repository. You
can also use the CLI to manually deploy changes without continuous deployment.

### <--{"id" : "Deploy with CLI"}--> Manual Deploys

- You can deploy your Cube project manually. This method uploads data schema and
+ You can deploy your Cube project manually. This method uploads data models and
configuration files directly from your local project directory.

You can obtain a Cube Cloud deploy token from your deployment **Settings** page.
24 changes: 13 additions & 11 deletions docs/content/Deployment/Overview.mdx
@@ -42,7 +42,7 @@ API instances.

API instances and Refresh Workers can be configured via [environment
variables][ref-config-env] or the [`cube.js` configuration file][ref-config-js].
- They also need access to the data schema files. Cube Store clusters can be
+ They also need access to the data model files. Cube Store clusters can be
configured via environment variables.

You can find an example Docker Compose configuration for a Cube deployment in
@@ -57,21 +57,22 @@ requests between multiple API instances.

The [Cube Docker image][dh-cubejs] is used for API Instance.

- API instance needs to be configured via environment variables, cube.js file and
- has access to the data schema files.
+ API instances can be configured via environment variables or the `cube.js`
+ configuration file, and **must** have access to the data model files (as
+ specified by [`schemaPath`][ref-conf-ref-schemapath]).

## Refresh Worker

A Refresh Worker updates pre-aggregations and invalidates the in-memory cache in
- the background. They also keep the refresh keys up-to-date for all defined
- schemas and pre-aggregations. Please note that the in-memory cache is just
- invalidated but not populated by Refresh Worker. In-memory cache is populated
- lazily during querying. On the other hand, pre-aggregations are eagerly
- populated and kept up-to-date by Refresh Worker.
+ the background. They also keep the refresh keys up-to-date for all data models
+ and pre-aggregations. Please note that the in-memory cache is just invalidated
+ but not populated by Refresh Worker. In-memory cache is populated lazily during
+ querying. On the other hand, pre-aggregations are eagerly populated and kept
+ up-to-date by Refresh Worker.

- [Cube Docker image][dh-cubejs] can be used for creating Refresh Workers; to make
- the service act as a Refresh Worker, `CUBEJS_REFRESH_WORKER=true` should be set
- in the environment variables.
+ The [Cube Docker image][dh-cubejs] can be used for creating Refresh Workers; to
+ make the service act as a Refresh Worker, `CUBEJS_REFRESH_WORKER=true` should be
+ set in the environment variables.

## Cube Store

@@ -275,6 +276,7 @@ guide][blog-migration-guide].
[ref-deploy-docker]: /deployment/platforms/docker
[ref-config-env]: /reference/environment-variables
[ref-config-js]: /config
+ [ref-conf-ref-schemapath]: /config#options-reference-schema-path
[redis]: https://redis.io
[ref-config-redis]: /reference/environment-variables#cubejs-redis-password
[blog-details]: https://cube.dev/blog/how-you-win-by-using-cube-store-part-1
44 changes: 26 additions & 18 deletions docs/content/Deployment/Production-Checklist.mdx
@@ -97,37 +97,45 @@ deployment's health and be alerted to any issues.

## Appropriate cluster sizing

- There's no one-size-fits-all when it comes to sizing Cube cluster, and its resources.
- Resources required by Cube depend a lot on the amount of traffic Cube needs to serve and the amount of data it needs to process.
- The following sizing estimates are based on default settings and are very generic, which may not fit your Cube use case, so you should always tweak resources based on consumption patterns you see.
+ There's no one-size-fits-all when it comes to sizing a Cube cluster and its
+ resources. Resources required by Cube significantly depend on the amount of
+ traffic Cube needs to serve and the amount of data it needs to process. The
+ following sizing estimates are based on default settings and are very generic,
+ which may not fit your Cube use case, so you should always tweak resources based
+ on consumption patterns you see.

### <--{"id" : "Appropriate cluster sizing"}--> Memory and CPU

- Each Cube cluster should contain at least 2 Cube API instances.
- Every Cube API instance should have at least 3GB of RAM and 2 CPU cores allocated for it.
+ Each Cube cluster should contain at least 2 Cube API instances. Every Cube API
+ instance should have at least 3GB of RAM and 2 CPU cores allocated for it.

- Refresh workers tend to be much more CPU and memory intensive, so at least 6GB of RAM is recommended.
- Please note that to take advantage of all available RAM, the Node.js heap size should be adjusted accordingly
- by using the [`--max-old-space-size` option][node-heap-size]:
+ Refresh workers tend to be much more CPU and memory intensive, so at least 6GB
+ of RAM is recommended. Please note that to take advantage of all available RAM,
+ the Node.js heap size should be adjusted accordingly by using the
+ [`--max-old-space-size` option][node-heap-size]:

```sh
NODE_OPTIONS="--max-old-space-size=6144"
```

- [node-heap-size]: https://nodejs.org/api/cli.html#--max-old-space-sizesize-in-megabytes
+ [node-heap-size]:
+ https://nodejs.org/api/cli.html#--max-old-space-sizesize-in-megabytes

- The Cube Store router node should have at least 6GB of RAM and 4 CPU cores allocated for it.
- Every Cube Store worker node should have at least 8GB of RAM and 4 CPU cores allocated for it.
- The Cube Store cluster should have at least two worker nodes.
+ The Cube Store router node should have at least 6GB of RAM and 4 CPU cores
+ allocated for it. Every Cube Store worker node should have at least 8GB of RAM
+ and 4 CPU cores allocated for it. The Cube Store cluster should have at least
+ two worker nodes.

### <--{"id" : "Appropriate cluster sizing"}--> RPS and data volume

- Depending on schema size, every Core Cube API instance can serve 1 to 10 requests per second.
- Every Core Cube Store router node can serve 50-100 queries per second.
- As a rule of thumb, you should provision 1 Cube Store worker node per one Cube Store partition or 1M of rows scanned in a query.
- For example if your queries scan 16M of rows per query, you should have at least 16 Cube Store worker nodes provisioned.
- `EXPLAIN ANALYZE` can be used to see partitions involved in a Cube Store query.
- Cube Cloud ballpark performance numbers can differ as it has different Cube runtime.
+ Depending on data model size, every Core Cube API instance can serve 1 to 10
+ requests per second. Every Core Cube Store router node can serve 50-100 queries
+ per second. As a rule of thumb, you should provision 1 Cube Store worker node
+ per one Cube Store partition or 1M rows scanned in a query. For example, if
+ your queries scan 16M rows per query, you should have at least 16 Cube Store
+ worker nodes provisioned. `EXPLAIN ANALYZE` can be used to see the partitions
+ involved in a Cube Store query. Cube Cloud ballpark performance numbers can
+ differ as it has a different Cube runtime.

[blog-migrate-to-cube-cloud]:
https://cube.dev/blog/migrating-from-self-hosted-to-cube-cloud/