
Background theory

Introduction

The EM1DTM and EM1DTMFWD programs are designed to interpret time-domain, small-loop electromagnetic data over a 1D layered Earth. These programs model the Earth's time-domain electromagnetic response to a small inductive loop source carrying a time-varying current. The data are the secondary magnetic field that results from currents and magnetization induced in the Earth.

Details regarding the source and receiver

EM1DTM and EM1DTMFWD assume that the transmitter loop is horizontal. The programs also assume that each receiver loop is sufficiently small that it may be treated as a point receiver; i.e., the spatial variation in magnetic flux through the receiver loop is negligible.

Details regarding the domain

Layered 1D model describing the Earth for each sounding.

EM1DTM and EM1DTMFWD model the Earth's response for measurements above a stack of uniform, horizontal layers. The coordinate system used for describing the Earth models has z as positive downwards, with the surface of the Earth at z=0. The z-coordinates of the source and receiver locations (which must be above the surface) are therefore negative. Each layer is defined by a thickness and an electrical conductivity. It is the physical properties of the layers that are obtained during the inversion of all measurements gathered at a single location, with the depths to the layer interfaces remaining fixed. If measurements made at multiple locations are being interpreted, the corresponding one-dimensional models are juxtaposed to create a two-dimensional image of the subsurface.

All measurements that are to be inverted for a single one- dimensional model must be grouped together as a “sounding”. A sounding can be considered a distinct collection of TEM measurements (receivers and times) for a given transmitter. Each different sounding is inverted for a separate one-dimensional model.

Two distinct groupings of transmitters and receivers (soundings) at the same location (left). Different soundings used to map lateral variation in the Earth (right).

The electrical conductivity of Earth materials varies over several orders of magnitude, making it more natural to invert for the logarithms of the layer conductivities rather than the conductivities themselves. This also ensures that conductivities in the constructed model are positive.
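As a minimal illustration of this parameterization (not code from EM1DTM), the inversion works with m = log σ, and any update to m maps back to a strictly positive conductivity:

```python
import numpy as np

# Layer conductivities spanning several orders of magnitude (S/m)
sigma = np.array([1e-4, 5e-2, 1.0])

# Invert for m = log(sigma); equal steps in m correspond to equal
# multiplicative changes in conductivity
m = np.log(sigma)

# An arbitrary model update in log space...
m_updated = m + np.array([0.5, -1.0, 0.2])

# ...always maps back to positive conductivities
sigma_updated = np.exp(m_updated)
print(np.all(sigma_updated > 0))  # True
```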

Details regarding the data

For a horizontal loop source, EM1DTM and EM1DTMFWD can handle any combination of:

  • times for the transmitter current
  • separations and heights for the transmitter and receiver(s)
  • B-field or time-varying (dB/dt) response
  • x, y and z-components of the response

Programs EM1DTM and EM1DTMFWD accept observations in two different forms: values of the secondary magnetic H-field in A/m and values of the total H-field in A/m. If the transmitter and receiver have the same orientation, the x-, y- and z-components of the secondary field are normalized by the x-, y- and z-components of the free-space field, respectively. If the transmitter and receiver have different orientations, the secondary field is normalized by the magnitude of the free-space field.
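A sketch of this normalization convention (a hypothetical helper for illustration, not a routine from the programs):

```python
import numpy as np

def normalize_secondary(Hs, H0, same_orientation):
    """Normalize the secondary-field components Hs = (Hx, Hy, Hz) by the
    free-space field H0, following the convention described above.
    Hypothetical helper for illustration only."""
    Hs = np.asarray(Hs, dtype=float)
    H0 = np.asarray(H0, dtype=float)
    if same_orientation:
        # component-by-component normalization
        return Hs / H0
    # normalize every component by the magnitude of the free-space field
    return Hs / np.linalg.norm(H0)

# Same orientation: each component is scaled by its free-space counterpart
print(normalize_secondary([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], True))   # [0.5 0.5 0.5]

# Different orientations: every component is scaled by |H0| = 5
print(normalize_secondary([5.0, 10.0, 0.0], [3.0, 0.0, 4.0], False))  # [1. 2. 0.]
```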

Forward Modeling

The method used to compute the magnetic field values for a particular source-receiver arrangement over a layered Earth model is the matrix propagation approach described in Farquharson & Oldenburg (1996) and Farquharson, Oldenburg & Routh (2003). The method uses the z-component of the Schelkunoff F-potential (Ward1987). Computations are done in the frequency domain, and the fields are then transformed to the time domain.

This section covers the forward-modelling procedure for the three components of the magnetic field above a horizontally-layered Earth model, for a horizontal, many-sided transmitter loop that is also above the surface of the model. The field for the transmitter loop is computed by superposition of the fields due to horizontal electric dipoles (see equations 4.134-4.152 of Ward1987). Because the loop is closed, the contributions from the ends of each electric dipole cancel, and the superposition is carried out only over the TE-mode component. This TE mode involves only the z-component of the Schelkunoff F-potential, just as for program EM1DFM.

The Schelkunoff F-potential is defined as follows:

$$\begin{aligned} \mathbf{E} = - \nabla \times \mathbf{F} \\\ \mathbf{H} = - \sigma \mathbf{F} + \frac{1}{i \omega \mu }\nabla (\nabla \cdot \mathbf{F}) \end{aligned}$$

where E and H are the electric and magnetic fields, σ and μ are the conductivity and permeability of the uniform region to which the above equations refer, and a time-dependence of eiωt has been assumed.

In the jth layer (j > 0), with conductivity σj and permeability μj, the z-component of the Schelkunoff potential satisfies the following equation (assuming the quasi-static approximation, and that the permeability of the layer is equal to that of free space, μ0):


$$\nabla^2 F_j \;-\; i\omega\mu_0\sigma_j F_j \;=\; 0$$

The propagation of F through the stack of layers therefore happens in exactly the same way as in the frequency-domain case, and so is not repeated here (see the EM1DFM Forward modeling section: http://em1dfm.readthedocs.io/en/latest/content/theory.html#forward-modeling).

Note

Assumptions:

  • eiωt time-dependence (just as in Ward1987)
  • quasi-static approximation throughout
  • z positive downwards
  • air halfspace (σ = 0) for z < 0; piecewise-constant model (σ > 0) of N layers for z ≥ 0, with the Nth layer being the basement (i.e. homogeneous) halfspace
  • magnetic permeability everywhere equal to that of free space

The propagation of F through the layers gives the following expression for the kernel of the Hankel transform, F̃0, in the air halfspace (z < 0):

$$\tilde{F}_0\;=D_0^S\Big(e^{-u_0z}\;+\;{P_{21}\over P_{11}}e^{u_0z}\Big)$$

(same as here in EM1DFM).

For a horizontal x-directed electric dipole at a height h (i.e., z =  − h, h > 0) above the surface of the layered Earth, the downward-decaying part of the primary solution (the only downward-decaying part of the solution in the air halfspace) at the surface of the Earth (z = 0) is given by

$$D_0^S\;=\;-\,{i\omega\mu_0\over 2u_0}\>{ik_y\over k_x^2+k_y^2}\>e^{-u_0h}$$

from Ward1987, equation (4.137). Substituting A-2 into A-1 gives

$$\tilde{F}_0\;=-\,{i\omega\mu_0\over 2u_0}\>{ik_y\over k_x^2+k_y^2}\> \Big(e^{-u_0(z+h)}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big).$$

Generalizing this expression for z above (z <  − h) as well as below the source (z >  − h):

$$\tilde{F}_0\;=-\,{i\omega\mu_0\over 2u_0}\>{ik_y\over k_x^2+k_y^2}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big).$$

Applying the inverse Fourier transform to A-4 gives

$$F_0(x,y,z,\omega)\;=\;-\,{1\over4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} {i\omega\mu_0\over 2u_0}\>{ik_y\over k_x^2+k_y^2}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big)\, e^{i(k_xx+k_yy)}\;dk_x\,dk_y$$

(equation (4.139) of Ward1987). Using the identity


$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \tilde{F}(k_x^2+k_y^2)\,e^{i(k_xx+k_yy)}\;dk_x\,dk_y \;=\; 2\pi\int_0^{\infty}\tilde{F}(\lambda)\,\lambda\,J_0(\lambda r)\;d\lambda,$$

(Ward1987, equation (2.10)) where λ2 = kx2 + ky2 and r2 = x2 + y2, A-5 can be rewritten as

$$F_0(x,y,z,\omega)\;=\;-\,{1\over2\pi}\,{\partial\over\partial y}\,\int_0^{\infty} {i\omega\mu_0\over 2u_0}\>{1\over\lambda^2}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big)\, \lambda\,J_0(\lambda r)\,d\lambda,$$

$$=\;-\,{i\omega\mu_0\over4\pi}\,{\partial\over\partial y}\,\int_0^{\infty} \,\Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, {1\over\lambda^2}\,J_0(\lambda r)\,d\lambda,$$

$$=\;{i\omega\mu_0\over4\pi}\,{y\over r}\,\int_0^{\infty} \,\Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, {1\over\lambda}\,J_1(\lambda r)\,d\lambda$$

since

$${\partial J_0(\lambda r)\over\partial y}\;=\;-\,\lambda{y\over r}\,J_1(\lambda r)$$

(Ward1987, equation 4.44 (almost)).

The H-field in the air halfspace can be obtained from A-9 (or A-8) by using equation (1.130) of Ward1987:

$$H_x\;=\;{1\over i\omega\mu_0}\,{\partial^2F_0\over\partial x\partial z},$$

$$H_y\;=\;{1\over i\omega\mu_0}\,{\partial^2F_0\over\partial y\partial z},$$

$$H_z\;=\;{1\over i\omega\mu_0}\,\Big({\partial^2\over\partial z^2}\,+\,\kappa_0^2\Big) \,F_0$$

$$=\;{1\over i\omega\mu_0}\,{\partial^2F_0\over\partial z^2}.$$

since κ02 = 0. Applying A-11 to A-9 gives

$$H_x(x,y,z,\omega)\;=\;{1\over4\pi}\,{\partial\over\partial x}\,{y\over r}\,\int_0^{\infty} \,\Big(\pm e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, J_1(\lambda r)\,d\lambda.$$

(The plus/minus depends on whether the observation location is above or below the source. In the program, it is perhaps only the secondary fields that are computed using the above expressions: the primary field, which corresponds to the first term in each Hankel transform kernel above, is computed using its form in (x, y, z)-space.) When the above expression for a horizontal electric dipole is integrated along a wire, all that remains is the effect of the endpoints. These cancel when integrating around the closed loop. So, as far as the part of Hx that contributes to the field due to a closed loop is concerned:


$$H_x(x,y,z,\omega)\;=\;0.$$

For the y-component of the H-field, first consider differentiating the expression for F0 in A-5 with respect to y:

$${\partial F_0\over\partial y}\;=\;-\,{1\over4\pi^2}\,{\partial\over\partial y}\, \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} {i\omega\mu_0\over 2u_0}\>{ik_y\over k_x^2+k_y^2}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big)\, e^{i(k_xx+k_yy)}\;dk_x\,dk_y,$$

$$=\;{1\over4\pi^2}\,\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} {i\omega\mu_0\over 2u_0}\>{k_y^2\over k_x^2+k_y^2}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big)\, e^{i(k_xx+k_yy)}\;dk_x\,dk_y,$$

$$=\;{1\over4\pi^2}\,\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} {i\omega\mu_0\over 2u_0}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big)\, e^{i(k_xx+k_yy)}\;dk_x\,dk_y$$

$$\qquad-\;{1\over4\pi^2}\,\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} {i\omega\mu_0\over 2u_0}\>{k_x^2\over k_x^2+k_y^2}\> \Big(e^{-u_0|z+h|}\;+\;{P_{21}\over P_{11}}e^{u_0(z-h)}\Big)\, e^{i(k_xx+k_yy)}\;dk_x\,dk_y,$$

since

$${k_y^2\over k_x^2+k_y^2}\;=\;1\>-\>{k_x^2\over k_x^2+k_y^2}.$$

Converting the kx2 into derivatives with respect to x, and converting the two-dimensional Fourier transforms to Hankel transforms gives

$${\partial F_0\over\partial y}\;=\;{i\omega\mu_0\over4\pi}\,\int_0^{\infty} \Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, J_0(\lambda r)\;d\lambda$$

$$\qquad\qquad+\;{i\omega\mu_0\over4\pi}\,{\partial^2\over\partial x^2}\,\int_0^{\infty} \Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, {1\over\lambda^2}\>J_0(\lambda r)\;d\lambda,$$

$$=\;{i\omega\mu_0\over4\pi}\,\int_0^{\infty} \Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, J_0(\lambda r)\;d\lambda$$

$$\qquad\qquad-\;{i\omega\mu_0\over4\pi}\,{\partial\over\partial x}\,{x\over r}\,\int_0^{\infty} \Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, {1\over\lambda}\>J_1(\lambda r)\;d\lambda,$$

using equations (4.144) and (4.117) of Ward1987. Differentiating A-22 with respect to z and scaling by iωμ0 (see A-12) gives

$$H_y(x,y,z,\omega)\;=\;{1\over4\pi}\,\int_0^{\infty} \Big(\pm e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, \lambda\,J_0(\lambda r)\;d\lambda$$

$$\qquad\qquad-\;{1\over4\pi}\,{\partial\over\partial x}\,{x\over r}\,\int_0^{\infty} \Big(\pm e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, J_1(\lambda r)\;d\lambda$$

(equation 4.150 of Ward1987). The second integral in the above expression only contributes at the ends of the dipole. So the only part of Hy required to compute the field due to the closed loop is

$$H_y(x,y,z,\omega)\;=\;{1\over4\pi}\,\int_0^{\infty} \Big(\pm e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, \lambda\,J_0(\lambda r)\;d\lambda.$$

Finally, applying A-14 to A-9 gives the z-component of the H-field:

$$H_z(x,y,z,\omega)\;=\;{1\over4\pi}\,{y\over r}\,\int_0^{\infty} \,\Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, \lambda\,J_1(\lambda r)\,d\lambda$$

(equation (4.152) of Ward1987).

A-24 and A-25 are for the total H-field (Hx = 0 from A-16 for an x-directed electric dipole, excluding the effects at the end-points), that is, the wholespace field up in the air plus the field due to currents induced in the layered Earth. In A-24 and A-25, the first part of the kernel of the Hankel transform corresponds to the primary wholespace field:

$$H_y(x,y,z,\omega)\;=\;{1\over4\pi}\,\int_0^{\infty} \pm\,e^{-\lambda|z+h|}\, \lambda\,J_0(\lambda r)\;d\lambda,$$

$$=\;{1\over4\pi}\,{\partial\over\partial z}\,\int_0^{\infty} e^{-\lambda|z+h|}\, J_0(\lambda r)\;d\lambda,$$

and

$$H_z(x,y,z,\omega)\;=\;{1\over4\pi}\,{y\over r}\,\int_0^{\infty} e^{-\lambda|z+h|}\, \lambda\,J_1(\lambda r)\,d\lambda$$

$$=\;-\,{1\over4\pi}\,{y\over r}\,{\partial\over\partial r}\,\int_0^{\infty} e^{-\lambda|z+h|}\, J_0(\lambda r)\,d\lambda.$$

From Ward1987 equation (4.53), the integral in the above two expressions can be done:

$$\int_0^{\infty}e^{-\lambda|z+h|}\,J_0(\lambda r)\,d\lambda\;=\; {1\over\big(r^2+(z+h)^2\big)^{1/2}}.$$

So

$$H_y(x,y,z,\omega)\;=\;{1\over4\pi}\,{\partial\over\partial z}\, {1\over\big(r^2+(z+h)^2\big)^{1/2}},$$

$$=\;-\,{1\over4\pi}\, {z+h\over\big(r^2+(z+h)^2\big)^{3/2}}$$

(equation (2.42) of Ward1987 for σ = 0), and

$$H_z(x,y,z,\omega)\;=\;-\,{1\over4\pi}\,{y\over r}\,{\partial\over\partial r}\, {1\over\big(r^2+(z+h)^2\big)^{1/2}},$$

$$=\;{1\over4\pi}\,{y\over r}\, {r\over\big(r^2+(z+h)^2\big)^{3/2}},$$

$$=\;{1\over4\pi}\, {y\over\big(r^2+(z+h)^2\big)^{3/2}}$$

(equation (2.42) of Ward1987 for σ = 0).
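The closed-form integral used above (Ward1987, equation (4.53)) can be verified numerically; the values of r and |z + h| below are arbitrary test values:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j0

r, zph = 2.0, 3.0   # r and |z + h|, arbitrary positive test values

# Left-hand side: integrate exp(-lambda*|z+h|) * J0(lambda*r) over lambda
lhs, _ = quad(lambda lam: np.exp(-lam * zph) * j0(lam * r), 0.0, np.inf)

# Right-hand side: 1 / sqrt(r^2 + (z+h)^2)
rhs = 1.0 / np.sqrt(r**2 + zph**2)

print(lhs, rhs)  # both approximately 0.27735
```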

Frequency- to time-domain transformation -- Part I

The solution for the H-field in the frequency domain for a delta-function source in time (and hence a flat, constant, real source term in the frequency domain) is, for example,

$$H_z(x,y,z,\omega)\;=\;{1\over4\pi}\,{y\over r}\,\int_0^{\infty} \,\Big(e^{-\lambda|z+h|}\;+\;{P_{21}\over P_{11}}e^{\lambda(z-h)}\Big)\, \lambda\,J_1(\lambda r)\,d\lambda.$$

Doing the inverse Fourier transform of these kinds of expressions does not encounter any subtleties, and gives an H-field as a function of time that, schematically, looks like:

S(t) = δ(t)  ⟶  Gh(t)

This is the basic response that program EM1DTM computes. The notation Gh(t) is used because this is the Green's function whose convolution with the transmitter current waveform S(t) gives the H-field:


$$h(t)\;=\;\int_{t^{\prime}=-\infty}^{t} G^h(t-t^{\prime})\,S(t^{\prime})\>dt^{\prime}.$$

The H-field for the delta-function source, that is, Gh, certainly exists for t > 0. Also, it is certainly zero for t < 0. And at t = 0 it certainly is not infinite (that would not be physical). Let's re-describe the function Gh (shown in the diagram above) as


$$G^h(t)\;=\;X(t)\,\tilde{G}^h(t),$$

where G̃h(t) is equal to Gh for t > 0, G̃h(0) = limt → 0+Gh, and does anything it wants for t < 0; X(t) is the Heaviside step function. This moves all issues about what is happening at t = 0 into the Heaviside function.

For measurements of voltage, the Green's function (impulse response) that is required is the time derivative of Gh (and for all t, not just t > 0). Schematically:

S(t) = δ(t)  ⟶  GV(t)

In terms of math:


$$V(t)\;=\;\int_{t^{\prime}=-\infty}^{t} G^V(t-t^{\prime})\,S(t^{\prime})\>dt^{\prime}.$$

Let's take the time derivative of AA-2 to get the full expression for GV:

$$\begin{aligned} G^V(t)\;&=\;{dG^h\over dt}, \\\ &=\;{d\over dt}\big(X\,\tilde{G}^h\big)\\\ &=\;X\,{d\tilde{G}^h\over dt}\;+\;\delta\,\tilde{G}^h, \end{aligned}$$

where δ is the delta function. This is not a time derivative that should be carried out numerically. So, given the basic Gh(t) and some representation of the transmitter current waveform S(t), program EM1DTM currently uses the re-arrangement of AA-3 given by substituting AA-4 into AA-3 and then integrating by parts:

$$\begin{aligned} V(t)\;&=\;\int_{t^{\prime}=-\infty}^{\infty} \Big\{X(t-t^{\prime})\,{d\tilde{G}^h\over dt^{\prime}}(t-t^{\prime})\;+\; \delta(t-t^{\prime})\,\tilde{G}^h(t-t^{\prime})\Big\} \,S(t^{\prime})\>dt^{\prime},\\\ &=\;\tilde{G}^h(0)\,S(t)\;+\; \int_{t^{\prime}=-\infty}^t{d\tilde{G}^h\over dt^{\prime}}(t-t^{\prime})\,S(t^{\prime})\>dt^{\prime}, \end{aligned}$$

where the Heaviside function has been used to restrict the limits of the integration. Now doing the integration by parts:

$$\begin{aligned} V(t)\;&=\;\tilde{G}^h(0)\,S(t)\;+\; \Big[\tilde{G}^h(t-t^{\prime})\,S(t^{\prime})\Big]_{t^{\prime}=-\infty}^t\;-\; \int_{t^{\prime}=-\infty}^t\tilde{G}^h(t-t^{\prime})\,{dS\over dt^{\prime}}(t^{\prime})\>dt^{\prime} \\\ &=\;\tilde{G}^h(0)\,S(t)\;+\; \tilde{G}^h(0)\,S(t)\;-\; \int_{t^{\prime}=-\infty}^t\tilde{G}^h(t-t^{\prime})\,{dS\over dt^{\prime}}(t^{\prime})\>dt^{\prime}. \end{aligned}$$

Which looks as though it has the expected additional non-convolution-integral term.

However, perhaps there should be an additional minus sign in going from AA-4 to the expression before AA-5, because the derivative has changed from d/dt to d/dt′. But perhaps not.

Frequency- to time-domain transformation

The Fourier transform that was applied to Maxwell's equations to get the frequency-domain equations was (see Ward1987, equation (1.1))


$$F(\omega)\;=\;\int_{-\infty}^{\infty}f(t)\>e^{-i\omega t}dt,$$

and the corresponding inverse transform is

$$f(t)\;=\;{1\over2\pi}\int_{-\infty}^{\infty}F(\omega)\>e^{i\omega t}d\omega.$$

For the frequency domain computations, it is assumed that the source term is the same for all frequencies. In other words, a flat spectrum, which corresponds to a delta-function time-dependence of the source.

Consider at the moment a causal signal, that is, one for which f(t) = 0 for t < 0. The Fourier transform of this signal is then

$$\begin{aligned} F(\omega)\;&=\;\int_0^{\infty}f(t)\>e^{-i\omega t}dt \\\ &=\;\int_0^{\infty}f(t)\>\cos\,\omega t\>dt\;-\;i\,\int_0^{\infty}f(t)\>\sin\,\omega t\>dt. \end{aligned}$$

Note that because of the dependence of the real part of F(ω) on cos  ωt and of the imaginary part on sin  ωt, the real part of F(ω) is even and the imaginary part of F(ω) is odd. Hence, f(t) can be obtained from either the real or imaginary part of its Fourier transform via the inverse cosine or sine transform:

$$\begin{aligned} f(t)\;&=\;{2\over\pi}\int_0^{\infty} {\rm Re}\,F(\omega)\>\cos\,\omega t\>d\omega,\quad{\rm or}\\\ f(t)\;&=\;-\,{2\over\pi}\int_0^{\infty} {\rm Im}\,F(\omega)\>\sin\,\omega t\>d\omega \end{aligned}$$

(For factor of  2/π see, for example, Arfken.)
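These transform pairs can be sanity-checked with a simple causal response: take f(t) = (1/τ)e−t/τ, whose spectrum is F(ω) = 1/(1 + iωτ) (an illustrative toy spectrum, not an EM response). The inverse cosine transform of Re F recovers f(t):

```python
import numpy as np
from scipy.integrate import quad

tau, t = 1.0, 0.5

# Re F(omega) for the toy spectrum F(omega) = 1/(1 + i*omega*tau)
re_F = lambda w: 1.0 / (1.0 + (w * tau)**2)

# f(t) = (2/pi) * integral_0^inf Re F(omega) cos(omega t) d omega,
# evaluated with scipy's oscillatory-weight (QAWF) quadrature
val, _ = quad(re_F, 0.0, np.inf, weight='cos', wvar=t)
f_t = (2.0 / np.pi) * val

print(f_t, np.exp(-t / tau) / tau)  # both approximately 0.6065
```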

Now consider that we have computed the H-field in the frequency domain for a uniform source spectrum. Then, from the above expressions, the time-domain H-field for a positive delta-function source time-dependence is

$$\begin{aligned} h_{\delta+}(t)\;&=\;{2\over\pi}\int_0^{\infty} {\rm Re}\,H(\omega)\>\cos\,\omega t\>d\omega,\quad{\rm or}\\\ h_{\delta+}(t)\;&=\;-\,{2\over\pi}\int_0^{\infty} {\rm Im}\,H(\omega)\>\sin\,\omega t\>d\omega \end{aligned}$$

where H(ω) is the frequency-domain H-field for the uniform source spectrum. For a negative delta-function source:

$$\begin{aligned} h_{\delta-}(t)\;&=\; -\,{2\over\pi}\int_0^{\infty} {\rm Re}\,H(\omega)\>\cos\,\omega t\>d\omega,\quad{\rm or}\\\ h_{\delta-}(t)\;&=\;{2\over\pi}\int_0^{\infty} {\rm Im}\,H(\omega)\>\sin\,\omega t\>d\omega. \end{aligned}$$

The negative delta-function source dependence is the derivative with respect to time of a step turn-off source dependence. Hence, the derivative of the time-domain H-field due to a step turn-off is also given by the above expressions:

$$\begin{aligned} {\partial h_{\rm s}\over\partial t}(t)\;&=\; -\,{2\over\pi}\int_0^{\infty} {\rm Re}\,H(\omega)\>\cos\,\omega t\>d\omega,\quad{\rm or}\\\ {\partial h_{\rm s}\over\partial t}(t)\;&=\; {2\over\pi}\int_0^{\infty} {\rm Im}\,H(\omega)\>\sin\,\omega t\>d\omega. \end{aligned}$$

Integrating the above two expressions gives the H-field for a step turn-off source:

$$\begin{aligned} h_{\rm s}(t)\;&=\;h(0)\> -\>{2\over\pi}\int_0^{\infty} {\rm Re}\,H(\omega)\>{1\over\omega}\,\sin\,\omega t\>d\omega,\quad{\rm or}\\\ h_{\rm s}(t)\;&=\; -\,{2\over\pi}\int_0^{\infty} {\rm Im}\,H(\omega)\>{1\over\omega}\,\cos\,\omega t\>d\omega. \end{aligned}$$

(See also Newman, Hohmann and Anderson, and Kaufman and Keller for all this.)
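The step turn-off formula can be checked with the same toy spectrum F(ω) = 1/(1 + iωτ) used above (an illustration, not an EM response): the impulse response is (1/τ)e−t/τ, so the step turn-off response should be its tail integral, e−t/τ:

```python
import numpy as np
from scipy.integrate import quad

tau, t = 1.0, 0.5

# Im F(omega)/omega for F(omega) = 1/(1 + i*omega*tau), simplified
# analytically so the integrand is regular at omega = 0
im_F_over_w = lambda w: -tau / (1.0 + (w * tau)**2)

# h_s(t) = -(2/pi) * integral_0^inf Im F(omega) (1/omega) cos(omega t) d omega
val, _ = quad(im_F_over_w, 0.0, np.inf, weight='cos', wvar=t)
h_s = -(2.0 / np.pi) * val

# For an impulse response (1/tau) e^(-t/tau), the step turn-off
# response is the tail integral e^(-t/tau)
print(h_s, np.exp(-t / tau))  # both approximately 0.6065
```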

Thinking in terms of the time-domain inhomogeneous differential equation:

$$\begin{aligned} L\,h_{\delta-}\;&=\;\delta_- \\\ \Rightarrow\quad L\,h_{\delta-}\;&=\;{\partial\over\partial t}H_{\rm o} \\\ \Rightarrow\quad L\,{\partial h_s\over\partial t}\;&=\;{\partial\over\partial t}H_{\rm o}. \end{aligned}$$

Diagram: fake/equivalent world vs. real world, showing h(t) and ∂h/∂t(t).

Top left is what we know (flat frequency spectrum for the source and sine transform of the imaginary part of the field), and top right is what we're after. Bottom left is obtained from top left by convolution with the box-car, and bottom right is what we're considering it to be. Note that there should really be some minus signs in the above diagram.

Diagram: fake/equivalent world vs. real world, showing ∫t h(t′) dt′ and h(t).

Again, top left is what we know (flat frequency spectrum for the source and sine transform of the imaginary part of the field), and top right is what we're after. Bottom left is obtained from top left by convolution with the box-car, and bottom right is what we're considering it to be. Note that there should really be some minus signs in the above diagram.

Diagram: fake/equivalent world vs. real world, showing h(t) and ∂h/∂t(t).

Top left is what we have, and top right is what we're thinking it is. Bottom left is the convolution with a discretized half-sine, and bottom right is what we're considering it to be: the time-derivative of the H-field for a half-sine waveform.

Integration of cubic splined function

The time-domain voltage or magnetic field ends up being known at a number of discrete, logarithmically (exponentially) spaced times as a result of Anderson's cosine/sine digital transform. This time-domain function is cubic splined in terms of the logarithms of the times. Hence, between any two discrete times, the time-domain function is approximated by the cubic spline


$$y(h)\;=\;y_0\,+\,q_1h\,+\,q_2h^2\,+\,q_3h^3,$$

(see routines RSPLN and RSPLE) where h = log x − log ti, x is the time at which the function y is required, ti is the ith time at which y is known (ti ≤ x ≤ ti + 1), y0 = y(log ti), and q1, q2 and q3 are the spline coefficients. The required integral is

$$\begin{aligned} \int_{x=a}^b y(\log x)\>dx\;&=\;\int_{\log x=\log a}^{\log b}y(\log x)\,x\,d(\log x) \\\ &=\;\int_{\log x=\log a}^{\log b}y(\log x)\,e^{\log x}\,d(\log x) \\\ &=\;\int_{h=\log a-\log t_i}^{\log b-\log t_i}y(h)\,e^{(h+\log t_i)}\,dh \\\ &=\;t_i\,\int_{h=\log a-\log t_i}^{\log b-\log t_i}y(h)\,e^h\,dh. \end{aligned}$$

Substituting the polynomial expression for y(h) into the above integral and worrying about each term individually gives:


$$\int y_0\,e^h\>dh\;=\;y_0\,e^h,$$


$$\int q_1h\,e^h\>dh\;=\;q_1\,e^h(h-1)$$

(G and R 2.322.1),


$$\int q_2h^2\,e^h\>dh\;=\;q_2\,e^h(h^2-2h+2)$$

(G and R 2.322.2), and


$$\int q_3h^3\,e^h\>dh\;=\;q_3\,e^h(h^3-3h^2+6h-6)$$

(G and R 2.322.3). Hence, summing the integrals above,

$$\begin{aligned} \int_{x=a}^b y(\log x)\>dx\;=&\;t_i\,y_0\Big({b\over t_i}\,-\,{a\over t_i}\Big)\\\ &+\;t_i\,q_1\Big({b\over t_i}(\log b-\log t_i-1)\>-\>{a\over t_i}(\log a-\log t_i-1)\Big)\\\ &+\;t_i\,q_2\Big({b\over t_i}\big((\log b-\log t_i)^2-2(\log b-\log t_i)+2\big)\>-\\\ &\quad\qquad\qquad{a\over t_i}\big((\log a-\log t_i)^2-2(\log a-\log t_i)+2\big)\Big)\\\ &+\;t_i\,q_3\Big({b\over t_i}\big((\log b-\log t_i)^3-3(\log b-\log t_i)^2+6(\log b-\log t_i)-6\big)\>-\\\ &\quad\qquad\qquad{a\over t_i}\big((\log a-\log t_i)^3-3(\log a-\log t_i)^2+6(\log a-\log t_i)-6\big)\Big). \end{aligned}$$
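The antiderivatives above combine to give this closed-form integral, which can be verified against direct numerical quadrature (the spline coefficients and limits below are arbitrary test values):

```python
import numpy as np
from scipy.integrate import quad

y0, q1, q2, q3 = 0.7, -0.3, 0.2, 0.05     # arbitrary spline coefficients
ti, a, b = 1e-3, 1.2e-3, 2.5e-3           # spline knot time and integration limits

def y(h):
    # cubic spline segment, h = log x - log t_i
    return y0 + q1*h + q2*h**2 + q3*h**3

def antideriv(h):
    # sum of the four antiderivatives of y(h) e^h listed above
    return np.exp(h) * (y0
                        + q1 * (h - 1.0)
                        + q2 * (h**2 - 2.0*h + 2.0)
                        + q3 * (h**3 - 3.0*h**2 + 6.0*h - 6.0))

ha, hb = np.log(a / ti), np.log(b / ti)

closed = ti * (antideriv(hb) - antideriv(ha))
numeric, _ = quad(lambda x: y(np.log(x / ti)), a, b)

print(closed, numeric)  # agree to quadrature accuracy
```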

The original plan was to treat a discretized transmitter current waveform as a piecewise linear function (i.e., straight line segments between the provided sample points), which meant that the response coming out of Anderson's filtering routine was convolved with the piecewise-constant time-derivative of the transmitter current waveform to give voltages. This proved to be not good enough for on-time calculations (the step-like nature of the approximation of the time derivative of the transmitter current waveform could be seen in the computed voltages). So it was decided to cubic spline the transmitter current waveform, which gives a piecewise quadratic approximation to the time derivative of the waveform. The convolution of the output of Anderson's routine is therefore now with a constant, a linear time term and a quadratic term. The integral above is still required, along with:

$$\begin{aligned} \int_{x=a}^b x\,y(\log x)\>dx\;&=\;\int_{\log x=\log a}^{\log b}y(\log x)\,x^2\,d(\log x) \\\ &=\;\int_{\log x=\log a}^{\log b}y(\log x)\,e^{2\log x}\,d(\log x) \\\ &=\;\int_{h=\log a-\log t_i}^{\log b-\log t_i}y(h)\,e^{(2h+2\log t_i)}\,dh \\\ &=\;t_i^2\,\int_{h=\log a-\log t_i}^{\log b-\log t_i}y(h)\,e^{2h}\,dh \end{aligned}$$

Using the integrals above for the various powers of h times eh, the relevant integrals for the various parts of the cubic spline representation of y(h) are:

$$\int y_0\,e^{2h}\>dh\;=\;y_0\,{1\over2}\,e^{2h},$$

$$\int q_1 h\,e^{2h}\>dh\;=\;q_1 {1\over4} e^{2h}(2h-1),$$

$$\int q_2 h^2 e^{2h}\>dh\;=\;q_2 {1\over8} e^{2h}(4h^2-4h+2),$$

$$\int q_3 h^3 e^{2h}\>dh\;=\;q_3 {1\over16} e^{2h}(8h^3-12h^2+12h-6).$$

The limits for the integral are h = log a − log ti and h = log b − log ti. The term e2h becomes:

$$\begin{aligned} e^{2(\log X-\log t_i)}\;&=\;\big\{e^{(\log X-\log t_i)}\big\}^2 \\\ &=\;\bigg\{{e^{\log X}\over e^{\log t_i}}\bigg\}^2 \\\ &=\;\bigg({X\over t_i}\bigg)^2 \\\ &=\;{X^2\over t_i^2} \end{aligned}$$

where X is either a or b. Hence,

$$\begin{aligned} \int_{x=a}^b x\,y(\log x)\>dx\;=&\;t_i^2\,y_0\,{1\over 2}\Big({b^2\over t_i^2}\,-\,{a^2\over t_i^2}\Big)\\\ &+\;t_i^2\,q_1\,{1\over 4}\Big({b^2\over t_i^2}(2\log b-2\log t_i-1)\>-\>{a^2\over t_i^2}(2\log a-2\log t_i-1)\Big)\\\ &+\;t_i^2\,q_2\,{1\over 8}\Big({b^2\over t_i^2}\big(4(\log b-\log t_i)^2-4(\log b-\log t_i)+2\big)\>-\\\ &\qquad\qquad\qquad{a^2\over t_i^2}\big(4(\log a-\log t_i)^2-4(\log a-\log t_i)+2\big)\Big)\\\ &+\;t_i^2\,q_3\,{1\over 16}\Big({b^2\over t_i^2}\big(8(\log b-\log t_i)^3-12(\log b-\log t_i)^2+12(\log b-\log t_i)-6\big)\>-\\\ &\qquad\qquad\qquad{a^2\over t_i^2}\big(8(\log a-\log t_i)^3-12(\log a-\log t_i)^2+12(\log a-\log t_i)-6\big)\Big). \end{aligned}$$

And

$$\begin{aligned} \int_{x=a}^b x^2\,y(\log x)\>dx\;&=\;\int_{\log x=\log a}^{\log b}y(\log x)\,x^3\,d(\log x) \\\ &=\;\int_{\log x=\log a}^{\log b}y(\log x)\,e^{3\log x}\,d(\log x) \\\ &=\;\int_{h=\log a-\log t_i}^{\log b-\log t_i}y(h)\,e^{(3h+3\log t_i)}\,dh \\\ &=\;t_i^3\,\int_{h=\log a-\log t_i}^{\log b-\log t_i}y(h)\,e^{3h}\,dh. \end{aligned}$$

And

$$\int y_0\,e^{3h}\>dh\;=\;y_0\,{1\over3}\,e^{3h},$$

$$\int q_1 h\,e^{3h}\>dh\;=\;q_1 {1\over9} e^{3h}(3h-1),$$

$$\int q_2 h^2 e^{3h}\>dh\;=\;q_2 {1\over27} e^{3h}(9h^2-6h+2),$$

$$\int q_3 h^3 e^{3h}\>dh\;=\;q_3 {1\over81} e^{3h}(27h^3-27h^2+18h-6).$$

Hence,

$$\begin{aligned} \int_{x=a}^b x^2\,y(\log x)\>dx\;=&\;t_i^3\,y_0\,{1\over 3}\Big({b^3\over t_i^3}\,-\,{a^3\over t_i^3}\Big)\\\ &+\;t_i^3\,q_1\,{1\over 9}\Big({b^3\over t_i^3}(3\log b-3\log t_i-1)\>-\>{a^3\over t_i^3}(3\log a-3\log t_i-1)\Big)\\\ &+\;t_i^3\,q_2\,{1\over 27}\Big({b^3\over t_i^3}\big(9(\log b-\log t_i)^2-6(\log b-\log t_i)+2\big)\>-\\\ &\qquad\qquad\qquad{a^3\over t_i^3}\big(9(\log a-\log t_i)^2-6(\log a-\log t_i)+2\big)\Big)\\\ &+\;t_i^3\,q_3\,{1\over 81}\Big({b^3\over t_i^3}\big(27(\log b-\log t_i)^3-27(\log b-\log t_i)^2+18(\log b-\log t_i)-6\big)\>-\\\ &\qquad\qquad\qquad{a^3\over t_i^3}\big(27(\log a-\log t_i)^3-27(\log a-\log t_i)^2+18(\log a-\log t_i)-6\big)\Big). \end{aligned}$$

In the previous two integrals of the product of x and x2 with the function splined in terms of log x, the x and x2 should really be (B − x) and (B − x)2, where B is the end of the relevant interval of the splined transmitter current waveform (because it is a convolution that is being performed):


$$I_1\;=\;\int_{x=a}^b\big(B-x\big)\,y(\log x)\>dx,$$

and


$$I_2\;=\;\int_{x=a}^b\big(B-x\big)^2\,y(\log x)\>dx.$$

Also, it was not really x and x2 in those integrals, because these terms come from the cubic splining of the transmitter current waveform: in each interval between discretization points it should be (x − A) and (x − A)2 that are involved, where A is the start of the relevant interval of the transmitter current waveform. Because

$$\begin{aligned} &\int_{x=a}^b\big(x-A\big)\>y(\log x)\>dx\;=\; -\,A\,\int_{x=a}^by(\log x)\>dx\;+\;\int_{x=a}^bx\,y(\log x)\>dx,\quad\hbox{and}\\\ &\int_{x=a}^b\big(x-A\big)^2\>y(\log x)\>dx\;=\; A^2\,\int_{x=a}^by(\log x)\>dx\;-\;2A\,\int_{x=a}^bx\,y(\log x)\>dx\;+\;\int_{x=a}^bx^2\,y(\log x)\>dx \end{aligned}$$

and

$$\begin{aligned} &\int_{x=a}^b\big(B-x\big)\>y(\log x)\>dx\;=\; B\,\int_{x=a}^by(\log x)\>dx\;-\;\int_{x=a}^bx\,y(\log x)\>dx,\quad\hbox{and}\\\ &\int_{x=a}^b\big(B-x\big)^2\>y(\log x)\>dx\;=\; B^2\,\int_{x=a}^by(\log x)\>dx\;-\;2B\,\int_{x=a}^bx\,y(\log x)\>dx\;+\;\int_{x=a}^bx^2\,y(\log x)\>dx \end{aligned}$$

then

$$\begin{aligned} &I_1\;=\;\big(B-A\big)\,\int_{x=a}^by(\log x)\>dx\;-\;\int_{x=a}^b\big(x-A\big)\>y(\log x)\>dx, \qquad\hbox{and}\\\ &I_2\;=\;\big(B-A\big)^2\int_{x=a}^by(\log x)\>dx\;-\; 2\big(B-A)\int_{x=a}^b\big(x-A\big)\>y(\log x)\>dx\;+\; \int_{x=a}^b\big(x-A\big)^2\,y(\log x)\>dx. \end{aligned}$$
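The log-time splining that underlies all of these integrals can be sketched with scipy (RSPLN and RSPLE are the program's own spline routines; this is just an equivalent illustration with a stand-in decay curve):

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Logarithmically spaced times, as produced by the digital transform
t_knots = np.logspace(-5, -2, 20)
y_knots = 1.0 / np.sqrt(t_knots)          # stand-in decay curve

# Spline y as a function of log(t), mirroring what RSPLN does
spline = CubicSpline(np.log(t_knots), y_knots)

# Evaluate between the knots, mirroring RSPLE
t_eval = 3.3e-4
y_eval = spline(np.log(t_eval))

print(y_eval, 1.0 / np.sqrt(t_eval))  # close agreement between the knots
```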

Computing Sensitivities

The inverse problem of determining the conductivity and/or susceptibility of the Earth from electromagnetic measurements is nonlinear. Program EM1DTM uses an iterative procedure to solve this problem. At each iteration the linearized approximation of the full nonlinear problem is solved. This requires the Jacobian matrix for the sensitivities, J = (Jσ, Jκ) where:

$$\begin{aligned} \begin{align} J_{ij}^\sigma &= \frac{\partial d_i}{\partial \log \sigma_j} \\\ J_{ij}^\kappa &= \frac{\partial d_i}{\partial \kappa_j} \end{align} \end{aligned}$$

in which di is the ith observation, and σj and κj are the conductivity and susceptibility of the jth layer.

The algorithm for computing the sensitivities is obtained by differentiating the expressions for the H-fields with respect to the model parameters (Farquharson2003). For example, the sensitivity with respect to mj (either the conductivity or susceptibility of the jth layer) of the z-component of the H-field for a z-directed magnetic dipole source is given by:

$$\frac{\partial H_z}{\partial m_j} (x,y,z,\omega) = \frac{1}{4\pi} \int_0^\infty \frac{\partial}{\partial m_j} \Bigg [ \frac{P_{21}}{P_{11}} \Bigg ] e^{\lambda (z-h)} \lambda^2 J_0(\lambda r) d\lambda$$

The derivative of the coefficient is simply:

$$\frac{\partial}{\partial m_j} \Bigg [ \frac{P_{21}}{P_{11}} \Bigg ] = \frac{\partial P_{21}}{\partial m_j} \frac{1}{P_{11}} - \frac{\partial P_{11}}{\partial m_j} \frac{P_{21}}{P_{11}^2}$$

where P11 and P21 are elements of the propagation matrix P. The derivative of P with respect to mj (for 1 ≤ j ≤ M − 1) is

$$\frac{\partial \mathbf{P}}{\partial m_j} = \mathbf{M_1 M_2 ... M_{j-1}} \Bigg ( \frac{\partial \mathbf{M_j}}{\partial m_j} \mathbf{M_{j+1}} + \mathbf{M_j} \frac{\partial \mathbf{M_{j+1}}}{\partial m_j} \Bigg ) \mathbf{M_{j+2} ... M_M}$$

The sensitivities with respect to the conductivity and susceptibility of the basement halfspace are given by

$$\frac{\partial \mathbf{P}}{\partial m_M} = \mathbf{M_1 M_2 ... M_{M-1}} \frac{\partial \mathbf{M_M}}{\partial m_M}$$

The derivatives of the individual layer matrices with respect to the conductivities and susceptibilities are straightforward to derive, and are not given here.

Just as for the forward modelling, the Hankel transform in eq. Sensitivity_z, and those in the corresponding expressions for the sensitivities of the other observations, are computed using the digital filtering routine of Anderson (Anderson1982).

The partial propagation matrices

$$\mathbf{P_k} = \mathbf{M_1} \prod_{j=2}^k \mathbf{M_j}, \;\;\; k=2,...,M$$

are computed during the forward modelling, and saved for re-use during the sensitivity computations. This sensitivity-equation approach therefore has the efficiency of an adjoint-equation approach.
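As a sketch of this bookkeeping, the product rule for ∂**P**/∂mj can be written with explicit partial products. The helper below is illustrative only (our own names, small NumPy layer matrices), assuming that only **M**j and **M**j+1 depend on mj, as in the expressions above; it is not EM1DTM's implementation.

```python
import numpy as np

def dP_dmj(M_list, dM_list, j):
    """Derivative of P = M_1 M_2 ... M_M with respect to the parameter of
    layer j (0-based).  Only M_j and M_{j+1} are assumed to depend on that
    parameter; dM_list[k] holds dM_k/dm_j for k = j and j+1 (zeros elsewhere)."""
    M = len(M_list)
    # left partial product M_1 ... M_{j-1} (identity when j == 0);
    # in EM1DTM these partial products are cached from the forward modelling
    left = np.eye(2)
    for k in range(j):
        left = left @ M_list[k]
    if j < M - 1:
        # product rule over the two matrices that depend on m_j
        middle = dM_list[j] @ M_list[j + 1] + M_list[j] @ dM_list[j + 1]
        right = np.eye(2)
        for k in range(j + 2, M):
            right = right @ M_list[k]
        return left @ middle @ right
    # basement halfspace: only M_M depends on m_M
    return left @ dM_list[j]
```

A finite-difference check on a toy three-matrix product confirms the product-rule structure.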

Inversion Methodologies

In program EM1DTM, there are four different inversion algorithms. They all have the same general formulation <theory_inversion_gen>, but differ in their treatment of the trade-off parameter (see fixed trade-off <theory_inversion_fixed>, discrepancy principle <theory_inversion_disc>, GCV <theory_inversion_gcv> and L-curve criterion <theory_inversion_lcurve>).

General formulation

The aim of each inversion algorithm is to construct the simplest model that adequately reproduces the observations. This is achieved by posing the inverse problem as an optimization problem in which we recover the model that minimizes the objective function:


Φ = ϕd + βϕm

The two components of this objective function are as follows. ϕd is the data misfit:


ϕd = Md(Wd(d − dobs))

where dobs is the vector containing the N observations, d is the forward-modelled data and Md(x) is some measure of the length of a vector x. It is assumed that the noise in the observations is Gaussian and uncorrelated, and that the estimated standard deviation of the noise in the ith observation is of the form s0si, where si indicates the amount of noise in the ith observation relative to that in the others, and s0 is a scale factor that specifies the total amount of noise in the set of observations. The matrix Wd is therefore given by:


Wd = diag{1/(s0s1), ..., 1/(s0sN)}

The model-structure component of the objective function is ϕm. In its most general form it contains four terms:

$$\begin{aligned} \begin{split} \phi_m =& \; \alpha_s^\sigma M_s \left( \mathbf{W_s^\sigma} \big ( \mathbf{m^\sigma - m_s^{\sigma , ref}} \big ) \right) + \alpha_z^\sigma M_z \left( \mathbf{W_z^\sigma} \big ( \mathbf{m^\sigma - m_z^{\sigma , ref}} \big ) \right) \\\ &+ \alpha_s^\kappa M_s \left( \mathbf{W_s^\kappa} \big ( \mathbf{m^\kappa - m_s^{\kappa , ref}} \big ) \right) + \alpha_z^\kappa M_z \left( \mathbf{W_z^\kappa} \big ( \mathbf{m^\kappa - m_z^{\kappa , ref}} \big ) \right) \end{split} \end{aligned}$$

where mσ is the vector containing the logarithms of the layer conductivities. The matrix Wsσ is:

$$\mathbf{W_s^\sigma} = \textrm{diag} \big \{ \sqrt{t_1}, ..., \sqrt{t_{M-1}}, \sqrt{t_{M-1}} \big \}$$

where tj is the thickness of the jth layer. And the matrix Wzσ is:

$$\begin{aligned} \mathbf{W_z^\sigma} = \begin{bmatrix} -\sqrt{\frac{2}{t_1 + t_2}} & \sqrt{\frac{2}{t_1 + t_2}} & & & & \\\ & -\sqrt{\frac{2}{t_2 + t_3}} & \sqrt{\frac{2}{t_2 + t_3}} & & & \\\ & & \ddots & & & \\\ & & & -\sqrt{\frac{2}{t_{M-2} + t_{M-1}}} & \sqrt{\frac{2}{t_{M-2} + t_{M-1}}} & \\\ & & & & -\sqrt{\frac{2}{t_{M-1}}} & \sqrt{\frac{2}{t_{M-1}}} \\\ & & & & & 0 \end{bmatrix} \end{aligned}$$

The rows of any of these two weighting matrices can be scaled if desired (see file for additional model-norm weights<supportingFiles_weight>). The vectors msσ,ref and mzσ,ref contain the layer conductivities for the two possible reference models. The two terms in ϕm therefore correspond to the “smallest” and “flattest” terms for the conductivity parts of the model. The relative importance of the two terms is governed by the coefficients αsσ and αzσ, which are discussed in the Fundamentals of Inversion <http://giftoolscookbook.readthedocs.io/en/latest/content/fundamentals/Alphas.html#the-alphas-parameters>. The trade-off parameter β <http://giftoolscookbook.readthedocs.io/en/latest/content/fundamentals/Beta.html#the-beta-parameter-trade-off> balances the opposing effects of minimizing the misfit and minimizing the amount of structure in the model. It is the different ways in which β is determined that distinguish the four inversion algorithms in program EM1DTM from one another. They are described in the next sections.
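As a concrete sketch, the helper below (a hypothetical name, not part of EM1DTM) assembles the two conductivity weighting matrices from the M − 1 finite layer thicknesses, assuming the basement entries reuse the last finite thickness as in the matrices above.

```python
import numpy as np

def smallest_flattest_weights(t):
    """Build W_s (smallest) and W_z (flattest) for M layers from the M-1
    finite thicknesses t; the basement halfspace reuses t_{M-1}."""
    t = np.asarray(t, dtype=float)
    M = len(t) + 1
    # W_s: diagonal of sqrt(layer thickness), basement weighted by t_{M-1}
    Ws = np.diag(np.sqrt(np.append(t, t[-1])))
    # W_z: first differences scaled by sqrt(2 / (t_j + t_{j+1}));
    # the last difference uses 2/t_{M-1}, and the final row is zero
    Wz = np.zeros((M, M))
    for j in range(M - 2):
        w = np.sqrt(2.0 / (t[j] + t[j + 1]))
        Wz[j, j], Wz[j, j + 1] = -w, w
    w = np.sqrt(2.0 / t[-1])
    Wz[M - 2, M - 2], Wz[M - 2, M - 1] = -w, w
    return Ws, Wz
```

Note that each nonzero row of Wz sums to zero, so a homogeneous model contributes nothing to the flattest term.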

General Norm Measures

Program EM1DTM uses general measures of the “length” of a vector instead of the traditional sum-of-squares measure. (For more on the use of general measures in nonlinear inverse problems, see Farquharson & Oldenburg, 1998.) Specifically, the measure used for the data misfit, Md, is Huber's M-measure:

$$M_d(\mathbf{x}) = \sum_{i=1}^N \rho_H(x_i)$$

where

$$\begin{aligned} \rho_H(x) = \left\{\begin{array}{lr} x^2 & |x| \leq c,\\\ 2c|x| - c^2 & |x| > c \end{array} \right. \end{aligned}$$

This measure is a composite measure, behaving like a quadratic (i.e., sum-of-squares) measure for elements of the vector smaller than the parameter c, and behaving like a linear (i.e., l1-norm) measure for elements of the vector larger than c. If c is chosen to be large relative to the elements of the vector, Md will give similar results to those for the sum-of-squares measure. For smaller values of c, for example, when c is only a factor of 2 or so greater than the average size of an element of the vector, Md will act as a robust measure of misfit, and hence be less biased by any outliers or other non-Gaussian noise in the observations.
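A direct translation of the Huber measure might look like the following sketch (hypothetical helper name; the linear branch uses |x| so the contribution stays positive for negative residuals):

```python
import numpy as np

def huber_misfit(x, c):
    """Huber M-measure: quadratic for |x_i| <= c, linear (2c|x_i| - c^2)
    beyond, so outliers contribute linearly rather than quadratically."""
    x = np.asarray(x, dtype=float)
    ax = np.abs(x)
    return float(np.sum(np.where(ax <= c, x**2, 2.0 * c * ax - c**2)))
```

For c much larger than every |xi| this reduces exactly to the sum of squares.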

The measure used for the components of the measure of model structure, Ms and Mz, is the lp-like measure of Ekblom:

$$M_m(\mathbf{x}) = \sum_{j=1}^M \rho_E(x_j)$$

where


ρE(x) = (x2+ϵ2)p/2

The parameter p can be used to vary the behaviour of this measure. For example, with p = 2, this measure behaves like the sum-of-squares measure, and a model constructed using this will have the smeared-out, fuzzy appearance that is characteristic of the sum-of-squares measure. In contrast, for p = 1, this measure does not bias against large jumps in conductivity from one layer to the next, and will result in a piecewise-constant, blocky model. The parameter ϵ is a small number, considerably smaller than the average size of an element of the vector. Its use is to avoid the numerical difficulties for zero-valued elements when p < 2 from which the true lp-norm suffers. In program EM1DTM, the values of p can be different for the smallest and flattest components of ϕm .
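Ekblom's measure translates just as directly (again a hypothetical helper, not EM1DTM's code):

```python
import numpy as np

def ekblom_measure(x, p, eps):
    """Ekblom's l_p-like measure: sum over (x_j^2 + eps^2)^(p/2).
    p = 2 behaves like a sum of squares; p = 1 tolerates blocky jumps;
    eps avoids the singularity at zero-valued elements when p < 2."""
    x = np.asarray(x, dtype=float)
    return float(np.sum((x**2 + eps**2) ** (p / 2.0)))
```

With p = 2 and eps = 0 this is the ordinary sum of squares; with p = 1 it approaches the l1-norm as eps shrinks.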

General Algorithm

As mentioned in the computing sensitivities <theory_sensitivities> section, the inverse problem considered here is nonlinear. It is solved using an iterative procedure. At the nth iteration, the actual objective function being minimized is:


Φn = ϕdn + βnϕmn

In the data misfit ϕdn, the forward-modelled data dn are the data for the model that is sought at the current iteration. These data are approximated by:


dn = dn − 1 + Jn − 1δm

where δm = mn − mn − 1 and Jn − 1 is the Jacobian matrix given by Sensitivity and evaluated for the model from the previous iteration. At the nth iteration, the problem to be solved is that of finding the change, (δmσ, δmκ), to the model which minimizes the objective function Φn. Differentiating eq. Objective_Fcn with respect to the components of δmσ and δmκ, and equating the resulting expressions to zero, gives the system of equations to be solved. This is slightly more involved now that ϕd and ϕm comprise Huber's M-measure and Ekblom's lp-like measure, rather than the usual sum-of-squares measures. Specifically, the derivative of HuberMeasure gives:

$$\frac{\partial M_d}{\partial \delta m_k} (\mathbf{x}) = \sum_{i=1}^N \rho^\prime_H (x_i) \frac{\partial x_i}{\partial \delta m_k}$$

where x = Wd(dn − 1 + Jn − 1δm − dobs). The vector of these derivatives with respect to the perturbations of the model parameters in all the layers can be written in matrix notation.

The linear system of equations to be solved for δm is therefore:

$$\begin{aligned} \begin{split} & \bigg [ \mathbf{J}^{n-1 \, T} \mathbf{W_d}^T \mathbf{W_d} \mathbf{J}^{n-1} + \beta^n \sum_{i=1}^2 \mathbf{W_i}^T \mathbf{W_i} + \frac{\gamma^n}{2} \mathbf{\hat{X}}^{n-1 \, T} \mathbf{\hat{X}}^{n-1} \bigg ] \delta \mathbf{m} = \\\ & \mathbf{J}^{n-1 \, T} \mathbf{W_d}^{T} \mathbf{W_d} \big ( \mathbf{d^{obs}} - \mathbf{d}^{n-1} \big ) + \beta^n \sum_{i=1}^2 \mathbf{W_i}^T \mathbf{W_i} \big ( \mathbf{m_i^{ref} - \mathbf{m}^{n-1}} \big ) + \frac{\gamma^n}{2} \mathbf{\hat{X}}^{n-1 \, T} \mathbf{\hat{X}}^{n-1} \mathbf{m}^{n-1} \end{split} \end{aligned}$$

where T denotes the transpose and:

$$\begin{aligned} \begin{split} \mathbf{J}^{n-1} &= \big ( \mathbf{J}^{\sigma , n-1} \mathbf{J}^{\kappa , n-1} \big ) \\\ \mathbf{W_1} &= \begin{bmatrix} \sqrt{\alpha_s^\sigma} \mathbf{W}_s^\sigma & 0 \\ 0 & \sqrt{\alpha_s^\kappa} \mathbf{W}_s^\kappa \end{bmatrix} \\\ \mathbf{W_2} &= \begin{bmatrix} \sqrt{\alpha_z^\sigma} \mathbf{W}_z^\sigma & 0 \\ 0 & \sqrt{\alpha_z^\kappa} \mathbf{W}_z^\kappa \end{bmatrix} \\\ \mathbf{m_1^{ref}} &= \big ( \mathbf{m}_s^{\sigma , ref \, T} \mathbf{m}_s^{\kappa , ref \, T} \big )^T \\\ \mathbf{m_2^{ref}} &= \big ( \mathbf{m}_z^{\sigma , ref \, T} \mathbf{m}_z^{\kappa , ref \, T} \big )^T \\\ \mathbf{\hat{X}}^{n-1} &= \big ( 0 \, (\mathbf{X}^{n-1})^{-1} \big ) \end{split} \end{aligned}$$

where Xn − 1 = diag{m1κ, n − 1, ..., mMκ, n − 1}. The solution to eq. Systemdm is equivalent to the least-squares solution of:

$$\begin{aligned} \begin{bmatrix} \mathbf{W_d J}^{n-1} \\ \sqrt{\beta^n} \mathbf{W_1} \\ \sqrt{\beta^n} \mathbf{W_2} \\ \sqrt{\gamma^n/2} \, \mathbf{\hat{X}}^{n-1} \end{bmatrix} \delta \mathbf{m} = \begin{bmatrix} \mathbf{W_d } ( \mathbf{d^{obs} - d}^{n-1} ) \\ \sqrt{\beta^n} \mathbf{W_1} ( \mathbf{m_1^{ref} - m}^{n-1} ) \\ \sqrt{\beta^n} \mathbf{W_2}( \mathbf{m_2^{ref} - m}^{n-1} ) \\ \sqrt{\gamma^n/2} \, \mathbf{\hat{X}}^{n-1} \mathbf{m}^{n-1} \end{bmatrix} \end{aligned}$$

Once the step δm has been determined by the solution of eq. Systemdm or eq. SystemdmLSQ, the new model is given by:


mn = mn − 1 + νδm

There are two conditions on the step length ν. First, if positivity of the layer susceptibilities is being enforced:


νδκj >  − κjn − 1

must hold for all j = 1, ..., M. Secondly, the objective function must be decreased by the addition of the step to the model:


ϕdn + βnϕmn − γnϕLBn < ϕdn − 1 + βnϕmn − 1 − γnϕLBn − 1

where ϕdn is now the misfit computed using the full forward modelling for the new model mn. To determine mn, a step length (ν) of either 1 or the maximum value for which eq. cond1 is true (whichever is greater) is tried. If eq. cond2 is true for the step length, it is accepted. If eq. cond2 is not true, ν is decreased by factors of 2 until it is true.
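The step-acceptance logic above can be sketched as a simple backtracking loop. This is an illustrative sketch with our own names (positivity handling omitted; `phi` stands for the full nonlinear objective evaluated by forward modelling):

```python
def line_search(m_old, dm, phi, nu0=1.0, max_halvings=20):
    """Backtracking step: start from nu0 and halve nu until the fully
    re-computed objective decreases.  Returns the new model and the
    accepted step length (old model and None if no decrease is found)."""
    phi_old = phi(m_old)
    nu = nu0
    for _ in range(max_halvings):
        m_new = [a + nu * b for a, b in zip(m_old, dm)]
        if phi(m_new) < phi_old:     # condition (cond2) of the text
            return m_new, nu
        nu *= 0.5                    # decrease nu by factors of 2
    return m_old, None
```

When positivity of the susceptibilities is enforced, nu0 would instead be the larger of 1 and the largest step length satisfying the positivity condition.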

Algorithm 1: fixed trade-off parameter

The trade-off parameter, β, remains fixed at its user-supplied value throughout the inversion. The least-squares solution of eq. SystemdmLSQ is used. This is computed using the subroutine “LSQR” of Paige & Saunders (Paige1982). If the desired value of β is known, this is the fastest of the four inversion algorithms as it does not involve a line search over trial values of β at each iteration. If the appropriate value of β is not known, it can be found using this algorithm by trial-and-error. This may or may not be time-consuming.

Algorithm 2: discrepancy principle

If a complete description of the noise in a set of observations is available - that is, both s0 and si (i = 1, ..., N) are known - the expectation of the misfit, E(ϕd), is equal to the number of observations N. Algorithm 2 therefore attempts to choose the trade-off parameter so that the misfit for the final model is equal to a target value of chifac × N. If the noise in the observations is well known, chifac should equal 1. However, chifac can be adjusted by the user to give a target misfit appropriate for a particular data-set. If a misfit as small as the target value cannot be achieved, the algorithm searches for the smallest possible misfit.

Experience has shown that choosing the trade-off parameter at early iterations in this way can lead to excessive structure in the model, and that removing this structure once the target (or minimum) misfit has been attained can require a significant number of additional iterations. A restriction is therefore placed on the greatest-allowed decrease in the misfit at any iteration, thus allowing structure to be slowly but steadily introduced into the model. In program EM1DTM, the target misfit at the nth iteration is given by:


ϕdn, tar = max(mfac × ϕdn − 1, chifac × N)

where the user-supplied factor mfac is such that 0.1 ≤ mfac ≤ 0.5.
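The target-misfit schedule is a one-liner; the sketch below uses hypothetical parameter names matching the text:

```python
def target_misfit(phi_prev, N, chifac=1.0, mfac=0.5):
    """Target misfit at iteration n: never below chifac*N, and never more
    than a factor-mfac reduction of the previous misfit, so structure is
    introduced into the model gradually."""
    assert 0.1 <= mfac <= 0.5          # allowed range for mfac
    return max(mfac * phi_prev, chifac * N)
```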

The step δm is found from the solution of eq. SystemdmLSQ using subroutine LSQR of Paige & Saunders (Paige1982). The line search at each iteration moves along the ϕd versus log β curve until either the target misfit, ϕdn, tar, is bracketed, in which case a bisection search is used to converge to the target, or the minimum misfit ( > ϕdn − 1) is bracketed, in which case a golden section search (for example, Press et al., 1986) is used to converge to the minimum. The starting value of β for each line search is βn − 1. For the first iteration, the β ( = β0) for the line search is given by N/ϕm(m), where m contains typical values of conductivity and/or susceptibility. Specifically, m is a model whose top M/5 layers have a conductivity of 0.02 S/m and susceptibility of 0.02 SI units, and whose remaining layers have a conductivity of 0.01 S/m and susceptibility of 0 SI units. Also, the reference models used in the computation of ϕm(m) are homogeneous halfspaces of 0.01 S/m and 0 SI units. The line search is efficient, but does involve the full forward modelling to compute the misfit for each trial value of β.

Algorithm 3: GCV criterion

If only the relative amount of noise in the observations is known - that is, si (i = 1, ..., N) is known but not s0 - the appropriate target value for the misfit cannot be determined, and hence Algorithm 2 is not the most suitable. The generalized cross-validation (GCV) method provides a means of estimating, during the course of an inversion, a value of the trade-off parameter that results in an appropriate fit to the observations, and in so doing, effectively estimating the level of noise, s0, in the observations (see, for example, Wahba1990; Hansen1998).

The GCV method is based on the following argument (Wahba1990; Haber1997; Haber2000). Consider inverting all but the first observation using a trial value of β, and then computing the individual misfit between the first observation and the first forward-modelled datum for the model produced by the inversion. This can be repeated leaving out all the other observations in turn, inverting the retained observations using the same value of β, and computing the misfit between the observation left out and the corresponding forward-modelled datum. The best value of β can then be defined as the one which gives the smallest sum of all the individual misfits. For a linear problem, this corresponds to minimizing the GCV function. For a nonlinear problem, the GCV method can be applied to the linearized problem being solved at each iteration (Haber1997; Haber2000; Li2003; Farquharson2000). From eq. Systemdm, the GCV function for the nth iteration is given by:

$$GCV (\beta ) = \dfrac{\big \| \mathbf{W_d \hat{d} - W_d J}^{n-1} \mathbf{M}^{-1} \big ( \mathbf{J}^{n-1 \, T} \mathbf{W_d}^T \mathbf{W_d \hat{d} + r} \big ) \big \|^2 }{\big [ \textrm{trace} \big ( \mathbf{I - W_d J}^{n-1} \mathbf{M}^{-1} \mathbf{J}^{n-1 \, T} \mathbf{W_d}^T \big ) \big ]^2}$$

where

$$\begin{aligned} \begin{split} \mathbf{M} (\beta) &= \bigg [ \mathbf{J}^{n-1 \, T} \mathbf{W_d}^T \mathbf{W_d} \mathbf{J}^{n-1} + \beta^n \sum_{i=1}^2 \mathbf{W_i}^T \mathbf{W_i} + \frac{\gamma^n}{2} \mathbf{\hat{X}}^{n-1 \, T} \mathbf{\hat{X}}^{n-1} \bigg ] \\\ \mathbf{r} &= \beta^n \sum_{i=1}^2 \mathbf{W_i}^T \mathbf{W_i} \big ( \mathbf{m_i^{ref} - \mathbf{m}^{n-1}} \big ) + \frac{\gamma^n}{2} \mathbf{\hat{X}}^{n-1 \, T} \mathbf{\hat{X}}^{n-1} \mathbf{m}^{n-1} \end{split} \end{aligned}$$

and d̂ = dobs − dn − 1. If β* is the value of the trade-off parameter that minimizes eq. GCV at the nth iteration, the actual value of β used to compute the new model is given by:


βn = max(β*, bfac × βn − 1)

where the user-supplied factor bfac is such that 0.01 < bfac < 0.5. As for Algorithm 2, this limit on the allowed decrease in the trade-off parameter prevents unnecessary structure being introduced into the model at early iterations. The inverse of the matrix M required in eq. GCV, and the solution to eq. Systemdm given this inverse, is computed using the Cholesky factorization routines from LAPACK (Anderson1999). The line search at each iteration moves along the curve of the GCV function versus the logarithm of the trade-off parameter until the minimum is bracketed (or bfac × βn − 1 reached), and then a golden section search (e.g., Press et al., 1986) is used to converge to the minimum. The starting value of β in the line search is βn − 1 ( β0 is estimated in the same way as for Algorithm 2). This is an efficient search, even with the inversion of the matrix M.

Algorithm 4: L-curve criterion

As for the GCV-based method <theory_inversion_gcv>, the L-curve method provides a means of estimating an appropriate value of the trade-off parameter if only si, i = 1, ..., N, are known and not s0. For a linear inverse problem, if the data misfit ϕd is plotted against the model norm ϕm for all reasonable values of the trade-off parameter β, the resulting curve tends to have a characteristic "L"-shape, especially when plotted on logarithmic axes (see, for example, Hansen1998). The corner of this L-curve corresponds to roughly equal emphasis on the misfit and model norm during the inversion. Moving along the L-curve away from the corner is associated with a progressively smaller decrease in the misfit for large increases in the model norm, or a progressively smaller decrease in the model norm for large increases in the misfit. The value of β at the point of maximum curvature on the L-curve is therefore the most appropriate, according to this criterion.

For a nonlinear problem, the L-curve criterion can be applied to the linearized inverse problem at each iteration (Li & Oldenburg, 1999; Farquharson & Oldenburg, 2000). In this situation, the L-curve is defined using the linearized misfit, which uses the approximation given in eq. DataPerturb for the forward-modelled data. The curvature of the L-curve is computed using the formula (Hansen, 1998):

$$C(\beta) = \frac{\zeta^\prime \eta^{\prime \prime } - \zeta^{\prime\prime} \eta^\prime}{\big ( (\zeta^\prime)^2 + (\eta^\prime)^2 \big )^{3/2}}$$

where ζ = log ϕdlin and η = log ϕm. The prime denotes differentiation with respect to log β. As for both Algorithms 2 <theory_inversion_disc> & 3 <theory_inversion_gcv>, a restriction is imposed on how quickly the trade-off parameter can be decreased from one iteration to the next. The actual value of β chosen for use at the nth iteration is given by eq. betachoice, where β* now corresponds to the value of β at the point of maximum curvature on the L-curve.

Experience has shown that the L-curve for the inverse problem considered here does not always have a sharp, distinct corner. The associated slow variation of the curvature with β can make the numerical differentiation required to evaluate eq. zetaeq prone to numerical noise. The line search along the L-curve used in program EM1DTM to find the point of maximum curvature is therefore designed to be robust (rather than efficient). The L-curve is sampled at equally-spaced values of log β, and long differences are used in the evaluation of eq. zetaeq to introduce some smoothing. A parabola is fit through the point from the equally-spaced sampling with the maximum value of curvature and its two nearest neighbours. The value of β at the maximum of this parabola is taken as β*. In addition, it is sometimes found that, for the range of values of β that are tried, the maximum value of the curvature of the L-curve on logarithmic axes is negative. In this case, the curvature of the L-curve on linear axes is investigated to find a maximum. As for Algorithms 1 & 2, the least-squares solution to eq. SystemdmLSQ is used, and is computed using subroutine LSQR of Paige & Saunders (Paige1982).
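A robust numerical version of the curvature evaluation might look like the sketch below, with `np.gradient` central differences standing in for the long differences described above (helper name is ours):

```python
import numpy as np

def lcurve_curvature(log_beta, log_phid, log_phim):
    """Curvature C(beta) of the (log phi_d, log phi_m) curve, with zeta
    and eta differentiated numerically with respect to log beta using
    central differences on an equally-spaced sampling."""
    z1 = np.gradient(log_phid, log_beta)   # zeta'
    e1 = np.gradient(log_phim, log_beta)   # eta'
    z2 = np.gradient(z1, log_beta)         # zeta''
    e2 = np.gradient(e1, log_beta)         # eta''
    return (z1 * e2 - z2 * e1) / (z1**2 + e1**2) ** 1.5
```

On a synthetic unit circle (ζ = cos θ, η = sin θ) the formula returns a curvature of 1 everywhere, which makes a convenient sanity check.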

Relative weighting within the model norm

The four coefficients in the model norm (see eq. MOF) are ultimately the responsibility of the user. Larger values of αsσ relative to αzσ result in constructed conductivity models that are closer to the supplied reference model. Smaller values of αsσ and αzσ result in flatter conductivity models. Likewise for the coefficients related to susceptibilities.

If both conductivity and susceptibility are active in the inversion, the relative size of αsσ & αzσ to αsκ & αzκ is also required. Program EM1DTM includes a simple means of calculating a default value for this relative balance. Using the layer thicknesses, weighting matrices Wsσ, Wzσ, Wsκ & Wzκ, and user-supplied weighting of the smallest and flattest parts of the conductivity and susceptibility components of the model norm (see acs, acz, ass & asz in the input file description<inputEM1DTM>, line 5), the following two quantities are computed for a test model m*:

$$\begin{aligned} \begin{split} \phi_m^\sigma &= acs \big \| \mathbf{W_s^\sigma} \big ( \mathbf{m}^* - \mathbf{m}_s^{\sigma, ref} \big ) \big \|^2 + acz \big \| \mathbf{W_z^\sigma} \big ( \mathbf{m}^* - \mathbf{m}_z^{\sigma, ref} \big ) \big \|^2 \\\ \phi_m^\kappa &= ass \big \| \mathbf{W_s^\kappa} \big ( \mathbf{m}^* - \mathbf{m}_s^{\kappa, ref} \big ) \big \|^2 + asz \big \| \mathbf{W_z^\kappa} \big ( \mathbf{m}^* - \mathbf{m}_z^{\kappa, ref} \big ) \big \|^2 \end{split} \end{aligned}$$

The conductivity and susceptibility of the top M/5 layers in the test model are 0.02 S/m and 0.02 SI units respectively, and the conductivity and susceptibility of the remaining layers are 0.01 S/m and 0 SI units. The coefficients of the model norm used in the inversion are then αsσ = acs, αzσ = acz, αsκ = As × ass & αzκ = As × asz, where As = ϕmσ/ϕmκ. It has been found that a balance between the conductivity and susceptibility portions of the model norm computed in this way is adequate as an initial guess. However, the balance usually requires modification by the user to obtain the best susceptibility model. (The conductivity model tends to be insensitive to this balance.) If anything, the default balance will suppress the constructed susceptibility model.

Positive susceptibility

Program EM1DTM can perform an unconstrained inversion for susceptibilities (along with the conductivities) as well as invert for values of susceptibility that are constrained to be positive. Following Li & Oldenburg (Li2003), the positivity constraint is implemented by incorporating a logarithmic barrier term in the objective function (see eqs. ObjectiveFun & barrier_cond). For the initial iteration, the coefficient of the logarithmic barrier term is chosen so that this term is of equal importance to the rest of the objective function:

$$\gamma^0 = \frac{\phi_d^0 + \beta^0 \phi_m^0}{- \phi^0_{LB}}$$

At subsequent iterations, the coefficient is reduced according to the formula:


γn = (1 − min(νn − 1, 0.925))γn − 1

where νn − 1 is the step length used at the previous iteration. As mentioned at the end of the general formulation <theory_inversion_gen>, when positivity is being enforced, the step length at any particular iteration must satisfy eq. cond1.
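The barrier-coefficient schedule can be sketched as follows (hypothetical helper; it assumes ϕ0LB is negative, as it is for susceptibilities below 1 SI, so that γ0 comes out positive):

```python
def barrier_schedule(phi_d0, beta0, phi_m0, phi_lb0, nus):
    """gamma^0 balances the barrier against phi_d + beta*phi_m; each
    subsequent iteration shrinks gamma by (1 - min(nu, 0.925)) using the
    previous step length nu.  Returns the list [gamma^0, gamma^1, ...]."""
    gammas = [(phi_d0 + beta0 * phi_m0) / (-phi_lb0)]
    for nu in nus:
        gammas.append((1.0 - min(nu, 0.925)) * gammas[-1])
    return gammas
```

The cap at 0.925 guarantees the coefficient never drops by more than a factor of 1/0.075 per iteration, even for full steps.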

Convergence criteria

To determine when an inversion algorithm has converged, the following criteria are used (Gill1981):

$$\begin{aligned} \begin{split} \Phi^{n-1} - \Phi^n &< \tau (1 + \Phi^n )\\\ \| \mathbf{m}^{n-1} - \mathbf{m}^n \| &< \sqrt{\tau} (1 + \| \mathbf{m}^n \| ) \end{split} \end{aligned}$$

where τ is a user-specified parameter. The algorithm is considered to have converged when both of the above equations are satisfied. The default value of τ is 0.01.
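These two tests translate directly into code (hypothetical helper name):

```python
import math

def converged(Phi_prev, Phi, m_prev, m, tau=0.01):
    """Convergence test: both the decrease in the objective and the
    change in the model must be small relative to current magnitudes."""
    dm = math.sqrt(sum((a - b) ** 2 for a, b in zip(m_prev, m)))
    nm = math.sqrt(sum(v * v for v in m))
    cond1 = (Phi_prev - Phi) < tau * (1.0 + Phi)
    cond2 = dm < math.sqrt(tau) * (1.0 + nm)
    return cond1 and cond2
```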

In case the algorithm happens directly upon the minimum, an additional condition is tested:


gn∥ ≤ ϵ

where ϵ is a small number close to zero, and where the gradient, gn, at the nth iteration is given by:

$$\mathbf{g}^n = -2 \mathbf{J}^{n \, T} \mathbf{W_d}^T \mathbf{W_d} ( \mathbf{d^{obs}} - \mathbf{d}^n ) - 2 \beta^n \sum_{i=1}^2 \mathbf{W_i}^T \mathbf{W_i} \big ( \mathbf{m_i^{ref}} - \mathbf{m}^{n} \big ) - \gamma^n \mathbf{\hat{X}}^{n \, T} \mathbf{\hat{X}}^{n} \mathbf{m}^{n}$$