
initial commit

feross committed Jul 15, 2018
0 parents commit ebf1030da441bfc6132f8f3554e206bc9dc19747
Showing with 240,498 additions and 0 deletions.
  1. +2 −0 .npmignore
  2. +3 −0 .travis.yml
  3. +20 −0 LICENSE
  4. +154 −0 README.md
  5. +137 −0 index.js
  6. +46 −0 package.json
  7. +119 −0 test/basic.js
  8. +200,003 −0 test/large-sitemap-0.xml
  9. +40,003 −0 test/large-sitemap-1.xml
  10. +11 −0 test/large-sitemap.xml
@@ -0,0 +1,2 @@
.travis.yml
test/
@@ -0,0 +1,3 @@
language: node_js
node_js:
- 'lts/*'
20 LICENSE
@@ -0,0 +1,20 @@
The MIT License (MIT)

Copyright (c) Feross Aboukhadijeh

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
154 README.md
@@ -0,0 +1,154 @@
# express-sitemap-xml [![travis][travis-image]][travis-url] [![npm][npm-image]][npm-url] [![downloads][downloads-image]][downloads-url] [![javascript style guide][standard-image]][standard-url]

[travis-image]: https://img.shields.io/travis/feross/express-sitemap-xml/master.svg
[travis-url]: https://travis-ci.org/feross/express-sitemap-xml
[npm-image]: https://img.shields.io/npm/v/express-sitemap-xml.svg
[npm-url]: https://npmjs.org/package/express-sitemap-xml
[downloads-image]: https://img.shields.io/npm/dm/express-sitemap-xml.svg
[downloads-url]: https://npmjs.org/package/express-sitemap-xml
[standard-image]: https://img.shields.io/badge/code_style-standard-brightgreen.svg
[standard-url]: https://standardjs.com

### Serve sitemap.xml from a list of URLs in Express

This package automatically handles sitemaps with more than 50,000 URLs. In these
cases, multiple sitemap files will be generated along with a "sitemap index" to
comply with the [sitemap spec](https://www.sitemaps.org/protocol.html) and
requirements from search engines like Google.

If only one sitemap file is needed (i.e. there are 50,000 or fewer URLs) then
it is served directly at `/sitemap.xml`. Otherwise, a sitemap index is served at
`/sitemap.xml` and sitemaps at `/sitemap-0.xml`, `/sitemap-1.xml`, etc.
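
As a rough illustration of the rule above, the sketch below lists which paths would be served for a given number of URLs. `plannedSitemapPaths` is a hypothetical helper written for this example, not part of the package's API; it assumes the boundary used in `index.js`, where exactly 50,000 URLs still fit in a single file.

```js
// Hypothetical helper, not exported by express-sitemap-xml.
// Mirrors the splitting rule: one file up to 50,000 URLs, otherwise an
// index at /sitemap.xml plus numbered sitemap files.
const MAX_URLS_PER_SITEMAP = 50 * 1000

function plannedSitemapPaths (urlCount) {
  if (urlCount <= MAX_URLS_PER_SITEMAP) return ['/sitemap.xml']

  const paths = ['/sitemap.xml'] // the sitemap index
  const fileCount = Math.ceil(urlCount / MAX_URLS_PER_SITEMAP)
  for (let i = 0; i < fileCount; i++) {
    paths.push(`/sitemap-${i}.xml`)
  }
  return paths
}

console.log(plannedSitemapPaths(3))
console.log(plannedSitemapPaths(120000))
```

With 120,000 URLs the helper plans three numbered files behind the index, since the last file holds the remaining 20,000 URLs.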

## Install

```
npm install express-sitemap-xml
```

## Demo

You can see this package in action on [BitMidi](https://bitmidi.com), a site for
listening to your favorite MIDI files.

## Usage (with Express)

The easiest way to use this package is with the Express middleware.

```js
const express = require('express')
const expressSitemapXml = require('express-sitemap-xml')

const app = express()

app.use(expressSitemapXml(loadUrls, 'https://bitmidi.com'))

async function loadUrls () {
  return getUrlsFromDatabase()
}
```

Remember to add a `Sitemap` line to `robots.txt` like this:

```
Sitemap: https://bitmidi.com/sitemap.xml
```

## Usage (without Express)

The package can also be used without the Express middleware.

```js
const { buildSitemaps } = require('express-sitemap-xml')

async function run () {
  const urls = ['/1', '/2', '/3']
  const sitemaps = await buildSitemaps(urls, 'https://bitmidi.com')

  console.log(Object.keys(sitemaps))
  // ['/sitemap.xml']

  console.log(sitemaps['/sitemap.xml'])
  // <?xml version="1.0" encoding="utf-8"?>
  // <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  //   <url>
  //     <loc>https://bitmidi.com/1</loc>
  //     <lastmod>YYYY-MM-DD</lastmod>
  //   </url>
  //   <url>
  //     <loc>https://bitmidi.com/2</loc>
  //     <lastmod>YYYY-MM-DD</lastmod>
  //   </url>
  //   <url>
  //     <loc>https://bitmidi.com/3</loc>
  //     <lastmod>YYYY-MM-DD</lastmod>
  //   </url>
  // </urlset>
  // (each `lastmod` is today's date, formatted as YYYY-MM-DD)
}

run()
```

Remember to add a `Sitemap` line to `robots.txt` like this:

```
Sitemap: https://bitmidi.com/sitemap.xml
```
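
Because the result is a plain object keyed by URL path, it can be served by any HTTP server. The sketch below uses Node's built-in `http` module; `createSitemapHandler` and `exampleSitemaps` are hypothetical names for this example, and the stand-in object only imitates the shape that `buildSitemaps` resolves to.

```js
const http = require('http')

// Hypothetical helper: returns a request handler that serves entries of a
// buildSitemaps()-style object and answers 404 for everything else.
function createSitemapHandler (sitemaps) {
  return (req, res) => {
    if (!Object.prototype.hasOwnProperty.call(sitemaps, req.url)) {
      res.statusCode = 404
      return res.end('Not found')
    }
    res.setHeader('Content-Type', 'application/xml')
    res.end(sitemaps[req.url])
  }
}

// Stand-in for the object that `await buildSitemaps(urls, base)` resolves to
const exampleSitemaps = {
  '/sitemap.xml': '<?xml version="1.0" encoding="utf-8"?><urlset/>'
}

const handler = createSitemapHandler(exampleSitemaps)
const server = http.createServer(handler)
// server.listen(3000) // uncomment to actually start serving
```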

## API

### `middleware = expressSitemapXml(loadUrls, base)`

Create a `sitemap.xml` middleware. Both arguments are required.

The `loadUrls` argument specifies an async function that resolves to an array of
URLs to be included in the sitemap. Each URL in the array can either be an
absolute or relative URL string like `'/1'`, or an object specifying additional
options about the URL:

```js
{
url: '/1',
lastMod: new Date('2000-02-02'),
changeFreq: 'weekly'
}
```

For more information about these options, see the [sitemap spec](https://www.sitemaps.org/protocol.html). Note that the `priority` option is not supported because [Google ignores it](https://twitter.com/methode/status/846796737750712320).

The `base` argument specifies the base URL to be used in case any URLs are
specified as relative URLs. The argument is also used if a sitemap index needs
to be generated and sitemap locations need to be specified, e.g.
`${base}/sitemap-0.xml` becomes `https://bitmidi.com/sitemap-0.xml`.
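
A `loadUrls` function that mixes plain strings and option objects might look like the sketch below. `fetchSongs` and its fields (`slug`, `updatedAt`) are hypothetical stand-ins for a real database query; only `url`, `lastMod` and `changeFreq` are options described above.

```js
// Hypothetical data source standing in for a real database query
async function fetchSongs () {
  return [
    { slug: 'song-a', updatedAt: new Date('2018-07-01') },
    { slug: 'song-b', updatedAt: new Date('2018-07-10') }
  ]
}

async function loadUrls () {
  const songs = await fetchSongs()
  const songUrls = songs.map(song => ({
    url: `/${song.slug}`,
    lastMod: song.updatedAt, // a Date, or an already formatted date string
    changeFreq: 'weekly'
  }))
  // Plain strings and option objects can be mixed in the same array
  return ['/', '/about', ...songUrls]
}

loadUrls().then(urls => console.log(urls))
```

A function like this would be passed as the first argument, e.g. `expressSitemapXml(loadUrls, 'https://bitmidi.com')`.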

### `sitemaps = expressSitemapXml.buildSitemaps(urls, base)`

Create an object where the keys are sitemap URLs to be served by the server and
the values are strings of sitemap XML content.

The `urls` argument is an array of URLs to be included in the sitemap. Each URL
in the array can either be an absolute or relative URL string like `'/1'`, or an
object specifying additional options about the URL. See above for more info
about the options.

The `base` argument is the same as above.

The return value is an object that looks like this:

```js
{
'/sitemap.xml': '<?xml version="1.0" encoding="utf-8"?>...'
}
```

Or if multiple sitemaps are needed, then the return object looks like this:

```js
{
'/sitemap.xml': '<?xml version="1.0" encoding="utf-8"?>...',
'/sitemap-0.xml': '<?xml version="1.0" encoding="utf-8"?>...',
'/sitemap-1.xml': '<?xml version="1.0" encoding="utf-8"?>...'
}
```
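
One possible use of this return value is to write each entry to disk for a static host. The sketch below assumes that goal; `writeSitemaps` is a hypothetical helper, and `exampleSitemaps` is a stand-in with the same shape as the object shown above.

```js
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical helper: writes every entry of a buildSitemaps()-style object
// into `outDir` and returns the file paths that were written.
function writeSitemaps (sitemaps, outDir) {
  fs.mkdirSync(outDir, { recursive: true })
  return Object.keys(sitemaps).map(urlPath => {
    const filePath = path.join(outDir, path.basename(urlPath))
    fs.writeFileSync(filePath, sitemaps[urlPath])
    return filePath
  })
}

// Stand-in for the object that `await buildSitemaps(urls, base)` resolves to
const exampleSitemaps = {
  '/sitemap.xml': '<?xml version="1.0" encoding="utf-8"?><sitemapindex/>',
  '/sitemap-0.xml': '<?xml version="1.0" encoding="utf-8"?><urlset/>',
  '/sitemap-1.xml': '<?xml version="1.0" encoding="utf-8"?><urlset/>'
}

const outDir = fs.mkdtempSync(path.join(os.tmpdir(), 'sitemaps-'))
const writtenFiles = writeSitemaps(exampleSitemaps, outDir)
console.log(writtenFiles)
```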

## License

MIT. Copyright (c) [Feross Aboukhadijeh](https://feross.org).
137 index.js
@@ -0,0 +1,137 @@
module.exports = expressSitemapXml
module.exports.buildSitemaps = buildSitemaps

const builder = require('xmlbuilder')
const mem = require('mem')
const { URL } = require('url') // TODO: Remove once Node 8 support is dropped

const MAX_SITEMAP_LENGTH = 50 * 1000 // Max URLs in a sitemap (defined by spec)
const SITEMAP_URL_RE = /\/sitemap(-\d+)?\.xml/ // Sitemap url pattern
const SITEMAP_MAX_AGE = 24 * 60 * 60 * 1000 // Cache sitemaps for 24 hours

function expressSitemapXml (loadUrls, base) {
  if (typeof loadUrls !== 'function') {
    throw new Error('Argument `loadUrls` must be a function')
  }
  if (typeof base !== 'string') {
    throw new Error('Argument `base` must be a string')
  }

  async function loadSitemaps () {
    const urls = await loadUrls()
    if (!Array.isArray(urls)) {
      throw new Error('async function `loadUrls` must resolve to an Array')
    }
    return buildSitemaps(urls, base)
  }

  const memoizedLoad = mem(loadSitemaps, { maxAge: SITEMAP_MAX_AGE })

  return async (req, res, next) => {
    const isSitemapUrl = SITEMAP_URL_RE.test(req.url)
    if (!isSitemapUrl) return next()

    try {
      const sitemaps = await memoizedLoad()
      if (sitemaps[req.url]) {
        // res.send() would otherwise label a string body as text/html
        res.type('application/xml')
        return res.status(200).send(sitemaps[req.url])
      }
    } catch (err) {
      // Forward loadUrls/build failures to Express instead of leaving an
      // unhandled promise rejection
      return next(err)
    }
    next()
  }
}

async function buildSitemaps (urls, base) {
  const sitemaps = Object.create(null)

  if (urls.length <= MAX_SITEMAP_LENGTH) {
    // If there is only one sitemap (i.e. there are at most 50,000 URLs)
    // then serve it directly at /sitemap.xml
    sitemaps['/sitemap.xml'] = buildSitemap(urls, base)
  } else {
    // Otherwise, serve a sitemap index at /sitemap.xml and sitemaps at
    // /sitemap-0.xml, /sitemap-1.xml, etc.
    for (let i = 0; i * MAX_SITEMAP_LENGTH < urls.length; i++) {
      const start = i * MAX_SITEMAP_LENGTH
      const selectedUrls = urls.slice(start, start + MAX_SITEMAP_LENGTH)
      sitemaps[`/sitemap-${i}.xml`] = buildSitemap(selectedUrls, base)
    }
    // Built last, so the index lists only the numbered sitemaps
    sitemaps['/sitemap.xml'] = buildSitemapIndex(sitemaps, base)
  }

  return sitemaps
}

function buildSitemapIndex (sitemaps, base) {
const sitemapObjs = Object.keys(sitemaps).map((sitemapUrl, i) => {
return {
loc: toAbsolute(sitemapUrl, base),
lastmod: getTodayStr()
}
})

const sitemapIndexObj = {
sitemapindex: {
'@xmlns': 'http://www.sitemaps.org/schemas/sitemap/0.9',
sitemap: sitemapObjs
}
}

return buildXml(sitemapIndexObj)
}

function buildSitemap (urls, base) {
const urlObjs = urls.map(url => {
if (typeof url === 'string') {
return {
loc: toAbsolute(url, base),
lastmod: getTodayStr()
}
}

if (typeof url.url !== 'string') {
throw new Error(
`Invalid sitemap url object, missing 'url' property: ${JSON.stringify(url)}`
)
}

const urlObj = {
loc: toAbsolute(url.url, base),
lastmod: (url.lastMod && dateToString(url.lastMod)) || getTodayStr()
}
if (typeof url.changeFreq === 'string') {
urlObj.changefreq = url.changeFreq
}
return urlObj
})

const sitemapObj = {
urlset: {
'@xmlns': 'http://www.sitemaps.org/schemas/sitemap/0.9',
url: urlObjs
}
}

return buildXml(sitemapObj)
}

function buildXml (obj) {
const opts = {
encoding: 'utf-8',
skipNullAttributes: true,
skipNullNodes: true
}
const xml = builder.create(obj, opts)
return xml.end({ pretty: true, allowEmpty: false })
}

function getTodayStr () {
return dateToString(new Date())
}

function dateToString (date) {
if (typeof date === 'string') return date
return date.toISOString().split('T')[0]
}

function toAbsolute (url, base) {
return new URL(url, base).href
}
@@ -0,0 +1,46 @@
{
"name": "express-sitemap-xml",
"description": "Serve sitemap.xml from a list of URLs in Express",
"version": "0.0.0",
"author": {
"name": "Feross Aboukhadijeh",
"email": "feross@feross.org",
"url": "https://feross.org"
},
"bugs": {
"url": "https://github.com/feross/express-sitemap-xml/issues"
},
"dependencies": {
"mem": "^3.0.1",
"xmlbuilder": "^10.0.0"
},
"devDependencies": {
"common-tags": "^1.8.0",
"standard": "*",
"tape": "^4.9.1"
},
"homepage": "https://github.com/feross/express-sitemap-xml",
"keywords": [
"express",
"google",
"serve sitemap",
"serve sitemap.xml",
"site map",
"site map xml",
"sitemap",
"sitemap generator",
"sitemap xml",
"sitemap.xml",
"sitemaps",
"xml"
],
"license": "MIT",
"main": "index.js",
"repository": {
"type": "git",
"url": "git://github.com/feross/express-sitemap-xml.git"
},
"scripts": {
"test": "standard && tape test/**/*.js"
}
}
