dma-mapping: replace DMA_ATTR_NON_CONSISTENT with dma_{alloc,free}_pages
Add a new API to allocate and free pages that are guaranteed to be
addressable by a device, but otherwise behave like pages allocated by
alloc_pages.  The intended APIs to sync them for use with the device
and cpu are dma_sync_single_for_{device,cpu} that are also used for
streaming mappings.

Switch all drivers over to this new API, but keep the usage of the
crufty dma_cache_sync API for now, which will be cleaned up on a driver
by driver basis.
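
For reference, the new entry points have the following prototypes (as added
to Documentation/core-api/dma-api.rst in this patch; note that the direction
argument comes before the gfp argument)::

	void *dma_alloc_pages(struct device *dev, size_t size,
			      dma_addr_t *dma_handle,
			      enum dma_data_direction dir, gfp_t gfp);
	void dma_free_pages(struct device *dev, size_t size, void *cpu_addr,
			    dma_addr_t dma_handle, enum dma_data_direction dir);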

Signed-off-by: Christoph Hellwig <hch@lst.de>
Christoph Hellwig authored and intel-lab-lkp committed Aug 19, 2020
1 parent 2161114 commit b66e4b7
Showing 29 changed files with 254 additions and 96 deletions.
68 changes: 31 additions & 37 deletions Documentation/core-api/dma-api.rst
@@ -516,48 +516,53 @@ routines, e.g.:::
}


Part II - Advanced dma usage
----------------------------
Part II - Non-coherent DMA allocations
--------------------------------------

Warning: These pieces of the DMA API should not be used in the
majority of cases, since they cater for unlikely corner cases that
don't belong in usual drivers.
These APIs allow allocating pages that can be used like normal pages
in the kernel direct mapping, but are guaranteed to be DMA addressable.

If you don't understand how cache line coherency works between a
processor and an I/O device, you should not be using this part of the
API at all.
API.

::

void *
dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
gfp_t flag, unsigned long attrs)
dma_alloc_pages(struct device *dev, size_t size, dma_addr_t *dma_handle,
enum dma_data_direction dir, gfp_t gfp)

This routine allocates a region of <size> bytes of non-coherent memory. It
returns a pointer to the allocated region (in the processor's virtual address
space) or NULL if the allocation failed. The returned memory is guaranteed to
behave like memory allocated using alloc_pages.

It also returns a <dma_handle> which may be cast to an unsigned integer the
same width as the bus and given to the device as the DMA address base of
the region.

Identical to dma_alloc_coherent() except that when the
DMA_ATTR_NON_CONSISTENT flag is passed in the attrs argument, the
platform will choose to return either consistent or non-consistent memory
as it sees fit. By using this API, you are guaranteeing to the platform
that you have all the correct and necessary sync points for this memory
in the driver should it choose to return non-consistent memory.
The dir parameter specifies whether data is read and/or written by the device;
see dma_map_single() for details.

Note: where the platform can return consistent memory, it will
guarantee that the sync points become nops.
The gfp parameter allows the caller to specify the ``GFP_`` flags (see
kmalloc()) for the allocation, but rejects flags used to specify a memory
zone such as GFP_DMA or GFP_HIGHMEM.

Warning: Handling non-consistent memory is a real pain. You should
only use this API if you positively know your driver will be
required to work on one of the rare (usually non-PCI) architectures
that simply cannot make consistent memory.
Before giving the memory to the device, dma_sync_single_for_device() needs
to be called, and before reading memory written by the device,
dma_sync_single_for_cpu(), just like for streaming DMA mappings that are
reused.

::

void
dma_free_attrs(struct device *dev, size_t size, void *cpu_addr,
dma_addr_t dma_handle, unsigned long attrs)
dma_free_pages(struct device *dev, size_t size, void *cpu_addr,
dma_addr_t dma_handle, enum dma_data_direction dir)

Free memory allocated by the dma_alloc_attrs(). All common
parameters must be identical to those otherwise passed to dma_free_coherent,
and the attrs argument must be identical to the attrs passed to
dma_alloc_attrs().
Free a region of memory previously allocated using dma_alloc_pages(). dev,
size, dma_handle and dir must all be the same as those passed into
dma_alloc_pages(). cpu_addr must be the virtual address returned by
dma_alloc_pages().
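
Putting the three calls together, a driver's use of this API follows the
pattern below (a hypothetical fragment for illustration only; the device
pointer, buffer size and error handling are placeholders, not code from this
patch)::

	buf = dma_alloc_pages(dev, size, &dma_handle, DMA_BIDIRECTIONAL,
			      GFP_KERNEL);
	if (!buf)
		return -ENOMEM;

	/* CPU fills the buffer, then hands ownership to the device. */
	memset(buf, 0, size);
	dma_sync_single_for_device(dev, dma_handle, size, DMA_BIDIRECTIONAL);

	/* ... run the DMA transfer ... */

	/* Reclaim ownership before the CPU reads what the device wrote. */
	dma_sync_single_for_cpu(dev, dma_handle, size, DMA_BIDIRECTIONAL);

	dma_free_pages(dev, size, buf, dma_handle, DMA_BIDIRECTIONAL);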

::

@@ -575,17 +580,6 @@ memory or doing partial flushes.
into the width returned by this call. It will also always be a power
of two for easy alignment.

::

void
dma_cache_sync(struct device *dev, void *vaddr, size_t size,
enum dma_data_direction direction)

Do a partial sync of memory that was allocated by dma_alloc_attrs() with
the DMA_ATTR_NON_CONSISTENT flag starting at virtual address vaddr and
continuing on for size. Again, you *must* observe the cache line
boundaries when doing this.


Part III - Debug drivers use of the DMA-API
-------------------------------------------
8 changes: 0 additions & 8 deletions Documentation/core-api/dma-attributes.rst
@@ -25,14 +25,6 @@ Since it is optional for platforms to implement DMA_ATTR_WRITE_COMBINE,
those that do not will simply ignore the attribute and exhibit default
behavior.

DMA_ATTR_NON_CONSISTENT
-----------------------

DMA_ATTR_NON_CONSISTENT lets the platform to choose to return either
consistent or non-consistent memory as it sees fit. By using this API,
you are guaranteeing to the platform that you have all the correct and
necessary sync points for this memory in the driver.

DMA_ATTR_NO_KERNEL_MAPPING
--------------------------

2 changes: 2 additions & 0 deletions arch/alpha/kernel/pci_iommu.c
@@ -957,5 +957,7 @@ const struct dma_map_ops alpha_pci_ops = {
.dma_supported = alpha_pci_supported,
.mmap = dma_common_mmap,
.get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
};
EXPORT_SYMBOL(alpha_pci_ops);
2 changes: 2 additions & 0 deletions arch/arm/mm/dma-mapping-nommu.c
@@ -176,6 +176,8 @@ static void arm_nommu_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist
const struct dma_map_ops arm_nommu_dma_ops = {
.alloc = arm_nommu_dma_alloc,
.free = arm_nommu_dma_free,
.alloc_pages = dma_direct_alloc_pages,
.free_pages = dma_direct_free_pages,
.mmap = arm_nommu_dma_mmap,
.map_page = arm_nommu_dma_map_page,
.unmap_page = arm_nommu_dma_unmap_page,
4 changes: 4 additions & 0 deletions arch/arm/mm/dma-mapping.c
@@ -199,6 +199,8 @@ static int arm_dma_supported(struct device *dev, u64 mask)
const struct dma_map_ops arm_dma_ops = {
.alloc = arm_dma_alloc,
.free = arm_dma_free,
.alloc_pages = dma_direct_alloc_pages,
.free_pages = dma_direct_free_pages,
.mmap = arm_dma_mmap,
.get_sgtable = arm_dma_get_sgtable,
.map_page = arm_dma_map_page,
@@ -226,6 +228,8 @@ static int arm_coherent_dma_mmap(struct device *dev, struct vm_area_struct *vma,
const struct dma_map_ops arm_coherent_dma_ops = {
.alloc = arm_coherent_dma_alloc,
.free = arm_coherent_dma_free,
.alloc_pages = dma_direct_alloc_pages,
.free_pages = dma_direct_free_pages,
.mmap = arm_coherent_dma_mmap,
.get_sgtable = arm_dma_get_sgtable,
.map_page = arm_coherent_dma_map_page,
2 changes: 2 additions & 0 deletions arch/ia64/hp/common/sba_iommu.c
@@ -2071,6 +2071,8 @@ static const struct dma_map_ops sba_dma_ops = {
.dma_supported = sba_dma_supported,
.mmap = dma_common_mmap,
.get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
};

static int __init
7 changes: 2 additions & 5 deletions arch/mips/jazz/jazzdma.c
@@ -505,9 +505,6 @@ static void *jazz_dma_alloc(struct device *dev, size_t size,
*dma_handle = vdma_alloc(virt_to_phys(ret), size);
if (*dma_handle == DMA_MAPPING_ERROR)
goto out_free_pages;

if (attrs & DMA_ATTR_NON_CONSISTENT)
return ret;
arch_dma_prep_coherent(page, size);
return (void *)(UNCAC_BASE + __pa(ret));

@@ -520,8 +517,6 @@ static void jazz_dma_free(struct device *dev, size_t size, void *vaddr,
dma_addr_t dma_handle, unsigned long attrs)
{
vdma_free(dma_handle);
if (!(attrs & DMA_ATTR_NON_CONSISTENT))
vaddr = __va(vaddr - UNCAC_BASE);
__free_pages(virt_to_page(vaddr), get_order(size));
}

@@ -622,5 +617,7 @@ const struct dma_map_ops jazz_dma_ops = {
.cache_sync = arch_dma_cache_sync,
.mmap = dma_common_mmap,
.get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
};
EXPORT_SYMBOL(jazz_dma_ops);
2 changes: 2 additions & 0 deletions arch/powerpc/kernel/dma-iommu.c
@@ -137,4 +137,6 @@ const struct dma_map_ops dma_iommu_ops = {
.get_required_mask = dma_iommu_get_required_mask,
.mmap = dma_common_mmap,
.get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
};
4 changes: 4 additions & 0 deletions arch/powerpc/platforms/ps3/system-bus.c
@@ -696,6 +696,8 @@ static const struct dma_map_ops ps3_sb_dma_ops = {
.unmap_page = ps3_unmap_page,
.mmap = dma_common_mmap,
.get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
};

static const struct dma_map_ops ps3_ioc0_dma_ops = {
@@ -708,6 +710,8 @@ static const struct dma_map_ops ps3_ioc0_dma_ops = {
.unmap_page = ps3_unmap_page,
.mmap = dma_common_mmap,
.get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
};

/**
2 changes: 2 additions & 0 deletions arch/powerpc/platforms/pseries/vio.c
@@ -608,6 +608,8 @@ static const struct dma_map_ops vio_dma_mapping_ops = {
.get_required_mask = dma_iommu_get_required_mask,
.mmap = dma_common_mmap,
.get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
};

/**
2 changes: 2 additions & 0 deletions arch/s390/pci/pci_dma.c
@@ -670,6 +670,8 @@ const struct dma_map_ops s390_pci_dma_ops = {
.unmap_page = s390_dma_unmap_pages,
.mmap = dma_common_mmap,
.get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
/* dma_supported is unconditionally true without a callback */
};
EXPORT_SYMBOL_GPL(s390_pci_dma_ops);
2 changes: 2 additions & 0 deletions arch/x86/kernel/amd_gart_64.c
@@ -678,6 +678,8 @@ static const struct dma_map_ops gart_dma_ops = {
.get_sgtable = dma_common_get_sgtable,
.dma_supported = dma_direct_supported,
.get_required_mask = dma_direct_get_required_mask,
.alloc_pages = dma_direct_alloc_pages,
.free_pages = dma_direct_free_pages,
};

static void gart_iommu_shutdown(void)
2 changes: 2 additions & 0 deletions drivers/iommu/dma-iommu.c
@@ -1120,6 +1120,8 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
static const struct dma_map_ops iommu_dma_ops = {
.alloc = iommu_dma_alloc,
.free = iommu_dma_free,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
.mmap = iommu_dma_mmap,
.get_sgtable = iommu_dma_get_sgtable,
.map_page = iommu_dma_map_page,
4 changes: 4 additions & 0 deletions drivers/iommu/intel/iommu.c
@@ -3669,6 +3669,8 @@ static const struct dma_map_ops intel_dma_ops = {
.dma_supported = dma_direct_supported,
.mmap = dma_common_mmap,
.get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
.get_required_mask = intel_get_required_mask,
};

@@ -3922,6 +3924,8 @@ static const struct dma_map_ops bounce_dma_ops = {
.sync_sg_for_device = bounce_sync_sg_for_device,
.map_resource = bounce_map_resource,
.unmap_resource = bounce_unmap_resource,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
.dma_supported = dma_direct_supported,
};

13 changes: 6 additions & 7 deletions drivers/net/ethernet/i825xx/lasi_82596.c
@@ -184,9 +184,8 @@ lan_init_chip(struct parisc_device *dev)

lp = netdev_priv(netdevice);
lp->options = dev->id.sversion == 0x72 ? OPT_SWAP_PORT : 0;
lp->dma = dma_alloc_attrs(dev->dev.parent, sizeof(struct i596_dma),
&lp->dma_addr, GFP_KERNEL,
DMA_ATTR_NON_CONSISTENT);
lp->dma = dma_alloc_pages(dev->dev.parent, sizeof(struct i596_dma),
&lp->dma_addr, DMA_BIDIRECTIONAL, GFP_KERNEL);
if (!lp->dma)
goto out_free_netdev;

@@ -196,8 +195,8 @@ lan_init_chip(struct parisc_device *dev)
return 0;

out_free_dma:
dma_free_attrs(dev->dev.parent, sizeof(struct i596_dma),
lp->dma, lp->dma_addr, DMA_ATTR_NON_CONSISTENT);
dma_free_pages(dev->dev.parent, sizeof(struct i596_dma),
lp->dma, lp->dma_addr, DMA_BIDIRECTIONAL);
out_free_netdev:
free_netdev(netdevice);
return retval;
@@ -209,8 +208,8 @@ static int __exit lan_remove_chip(struct parisc_device *pdev)
struct i596_private *lp = netdev_priv(dev);

unregister_netdev (dev);
dma_free_attrs(&pdev->dev, sizeof(struct i596_private), lp->dma,
lp->dma_addr, DMA_ATTR_NON_CONSISTENT);
dma_free_pages(&pdev->dev, sizeof(struct i596_private), lp->dma,
lp->dma_addr, DMA_BIDIRECTIONAL);
free_netdev (dev);
return 0;
}
12 changes: 6 additions & 6 deletions drivers/net/ethernet/seeq/sgiseeq.c
@@ -740,8 +740,8 @@ static int sgiseeq_probe(struct platform_device *pdev)
sp = netdev_priv(dev);

/* Make private data page aligned */
sr = dma_alloc_attrs(&pdev->dev, sizeof(*sp->srings), &sp->srings_dma,
GFP_KERNEL, DMA_ATTR_NON_CONSISTENT);
sr = dma_alloc_pages(&pdev->dev, sizeof(*sp->srings), &sp->srings_dma,
DMA_BIDIRECTIONAL, GFP_KERNEL);
if (!sr) {
printk(KERN_ERR "Sgiseeq: Page alloc failed, aborting.\n");
err = -ENOMEM;
@@ -802,8 +802,8 @@ static int sgiseeq_probe(struct platform_device *pdev)
return 0;

err_out_free_attrs:
dma_free_attrs(&pdev->dev, sizeof(*sp->srings), sp->srings,
sp->srings_dma, DMA_ATTR_NON_CONSISTENT);
dma_free_pages(&pdev->dev, sizeof(*sp->srings), sp->srings,
sp->srings_dma, DMA_BIDIRECTIONAL);
err_out_free_dev:
free_netdev(dev);

@@ -817,8 +817,8 @@ static int sgiseeq_remove(struct platform_device *pdev)
struct sgiseeq_private *sp = netdev_priv(dev);

unregister_netdev(dev);
dma_free_attrs(&pdev->dev, sizeof(*sp->srings), sp->srings,
sp->srings_dma, DMA_ATTR_NON_CONSISTENT);
dma_free_pages(&pdev->dev, sizeof(*sp->srings), sp->srings,
sp->srings_dma, DMA_BIDIRECTIONAL);
free_netdev(dev);

return 0;
2 changes: 2 additions & 0 deletions drivers/parisc/ccio-dma.c
@@ -1025,6 +1025,8 @@ static const struct dma_map_ops ccio_ops = {
.map_sg = ccio_map_sg,
.unmap_sg = ccio_unmap_sg,
.get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
};

#ifdef CONFIG_PROC_FS
2 changes: 2 additions & 0 deletions drivers/parisc/sba_iommu.c
@@ -1077,6 +1077,8 @@ static const struct dma_map_ops sba_ops = {
.map_sg = sba_map_sg,
.unmap_sg = sba_unmap_sg,
.get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
};


8 changes: 4 additions & 4 deletions drivers/scsi/53c700.c
@@ -300,8 +300,8 @@ NCR_700_detect(struct scsi_host_template *tpnt,
memory = dma_alloc_coherent(dev, TOTAL_MEM_SIZE, &pScript, GFP_KERNEL);
if (!memory) {
hostdata->noncoherent = 1;
memory = dma_alloc_attrs(dev, TOTAL_MEM_SIZE, &pScript,
GFP_KERNEL, DMA_ATTR_NON_CONSISTENT);
memory = dma_alloc_pages(dev, TOTAL_MEM_SIZE, &pScript,
DMA_BIDIRECTIONAL, GFP_KERNEL);
}
if (!memory) {
printk(KERN_ERR "53c700: Failed to allocate memory for driver, detaching\n");
@@ -414,8 +414,8 @@ NCR_700_release(struct Scsi_Host *host)
(struct NCR_700_Host_Parameters *)host->hostdata[0];

if (hostdata->noncoherent)
dma_free_attrs(hostdata->dev, TOTAL_MEM_SIZE, hostdata->script,
hostdata->pScript, DMA_ATTR_NON_CONSISTENT);
dma_free_pages(hostdata->dev, TOTAL_MEM_SIZE, hostdata->script,
hostdata->pScript, DMA_BIDIRECTIONAL);
else
dma_free_coherent(hostdata->dev, TOTAL_MEM_SIZE,
hostdata->script, hostdata->pScript);
